There is only one way for this sequence to occur. For those bases to occur in that exact order. Where you make the cuts is irrelevant. That's the odds of that specific sequence occurring at any point. There's a 1 in 4 chance of having cytosine at any point in the genome. There's a 1 in 4 chance will be followed by thymine. There's a 1 in 4 chance that that will be followed by cytosine again, etc. What I gave was an absolute value; the odds of this specific sequence occurring anywhere.
Your calculation presupposes the genome length of the virus is exactly the length of the sequence, that's the mistake I'm pointing out. Think about it this way. Lets say I tell you that I've written down a random five letter word (each letter being completely random, that is). Now I ask you to calculate the odds of it being "horse". You would correctly say that that's one in 26*26*26*26*26. However let's say I give you a book comprised of random letters instead (the whole genome of the virus). Wouldn't you agree that the chance of it containing the word "horse" anywhere in it is much higher?
Now lets say I give you a whole dictionary of meaningful words and phrases (in our analogy that's all the patented genome sequences). I'm sure you understand that the probability of at least one word being contained in the book is even higher in that case. And now consider that in our problem the letters aren't random at all, because some genome sequences are more useful than others and the useful sequences are much more likely to be patented.
That's what makes calculating probabilities of such real world events so tricky. I'm not an expert either, and I wouldn't be surprised if my back-of-the-napkin calculation is quite off. But that's the sort of things you need to take into account to arrive at a reasonable conclusion.
I make no presupposition regarding the length of the genome. Read what I said again. Go to any point within the entire genome, you have a 25% chance of finding any particular nucleotide base. Pluck any string of 19 out of the genome; regardless of how long the entire genome is, this is the chance that it will match this sequence. That is the calculation I did. True, if there are multiple sets to choose from, that increases the odds of finding this sequence, but that isn't the calculation I did. I gave the chance that any single set of 19 matches. I never tried to represent it as anything else.
There is only one way for this sequence to occur. For those bases to occur in that exact order. Where you make the cuts is irrelevant. That's the odds of that specific sequence occurring at any point. There's a 1 in 4 chance of having cytosine at any point in the genome. There's a 1 in 4 chance will be followed by thymine. There's a 1 in 4 chance that that will be followed by cytosine again, etc. What I gave was an absolute value; the odds of this specific sequence occurring anywhere.
Your calculation presupposes the genome length of the virus is exactly the length of the sequence, that's the mistake I'm pointing out. Think about it this way. Lets say I tell you that I've written down a random five letter word (each letter being completely random, that is). Now I ask you to calculate the odds of it being "horse". You would correctly say that that's one in 26*26*26*26*26. However let's say I give you a book comprised of random letters instead (the whole genome of the virus). Wouldn't you agree that the chance of it containing the word "horse" anywhere in it is much higher?
Now lets say I give you a whole dictionary of meaningful words and phrases (in our analogy that's all the patented genome sequences). I'm sure you understand that the probability of at least one word being contained in the book is even higher in that case. And now consider that in our problem the letters aren't random at all, because some genome sequences are more useful than others and the useful sequences are much more likely to be patented.
That's what makes calculating probabilities of such real world events so tricky. I'm not an expert either, and I wouldn't be surprised if my back-of-the-napkin calculation is quite off. But that's the sort of things you need to take into account to arrive at a reasonable conclusion.
I make no presupposition regarding the length of the genome. Read what I said again. Go to any point within the entire genome, you have a 25% chance of finding any particular nucleotide base. Pluck any string of 19 out of the genome; regardless of how long the entire genome is, this is the chance that it will match this sequence. That is the calculation I did. True, if there are multiple sets to choose from, that increases the odds of finding this sequence, but that isn't the calculation I did. I gave the chance that any single set of 19 matches. I never tried to represent it as anything else.