Your calculation presupposes that the genome length of the virus is exactly the length of the sequence; that's the mistake I'm pointing out. Think about it this way. Let's say I tell you that I've written down a random five-letter word (each letter being completely random, that is). Now I ask you to calculate the odds of it being "horse". You would correctly say that it's one in 26*26*26*26*26. However, let's say I give you a book composed of random letters instead (the whole genome of the virus). Wouldn't you agree that the chance of it containing the word "horse" anywhere in it is much higher?
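To put rough numbers on the "horse" analogy, here is a minimal Python sketch. It assumes every letter is drawn uniformly and independently, and it treats each starting position in the book as an independent trial, which is only an approximation because neighbouring windows overlap. The function names and the book length of one million letters are made up for illustration.

```python
import math


def p_single_window(alphabet_size: int, word_length: int) -> float:
    """Chance that one specific window of random letters spells a given word."""
    return 1.0 / alphabet_size ** word_length


def p_at_least_once(alphabet_size: int, word_length: int, text_length: int) -> float:
    """Approximate chance that the word shows up somewhere in a random text.

    Assumption: the (text_length - word_length + 1) windows are treated as
    independent, which overlapping windows are not. It is a rough estimate.
    """
    windows = max(text_length - word_length + 1, 0)
    p = p_single_window(alphabet_size, word_length)
    # 1 - (1 - p)**windows, written with log1p/expm1 so tiny p keeps precision.
    return -math.expm1(windows * math.log1p(-p))


if __name__ == "__main__":
    print(p_single_window(26, 5))             # one five-letter word
    print(p_at_least_once(26, 5, 1_000_000))  # hypothetical million-letter book
```

The second number comes out several orders of magnitude larger than the first, which is the whole point of the book analogy.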
Now let's say I give you a whole dictionary of meaningful words and phrases (in our analogy, that's all the patented genome sequences). I'm sure you understand that the probability of at least one word being contained in the book is even higher in that case. And now consider that in our problem the letters aren't random at all, because some genome sequences are more useful than others and the useful sequences are much more likely to be patented.
That's what makes calculating probabilities of such real-world events so tricky. I'm not an expert either, and I wouldn't be surprised if my back-of-the-napkin calculation is quite off. But those are the sorts of things you need to take into account to arrive at a reasonable conclusion.
I make no presupposition regarding the length of the genome. Read what I said again. Go to any point within the entire genome, and you have a 25% chance of finding any particular nucleotide base. Pluck any string of 19 out of the genome; regardless of how long the entire genome is, the chance that it will match this sequence is the same. That is the calculation I did. True, if there are multiple sets to choose from, that increases the odds of finding this sequence, but that isn't the calculation I did. I gave the chance that any single set of 19 matches. I never tried to represent it as anything else.
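For reference, the two quantities being argued over can be written down side by side. This is only a sketch under the stated 25%-per-base assumption (uniform, independent bases). The genome length of 30,000 is a placeholder picked for illustration, not a figure from this thread, and the function names are made up.

```python
def single_match_probability(sequence_length: int, p_base: float = 0.25) -> float:
    """Chance that one plucked string of bases matches a fixed target sequence."""
    return p_base ** sequence_length


def expected_matches(genome_length: int, sequence_length: int, p_base: float = 0.25) -> float:
    """Expected number of positions in a random genome that match the target.

    By linearity of expectation this is (number of windows) * (single-window
    chance); unlike an "at least once" probability, it does not require the
    overlapping windows to be independent.
    """
    windows = max(genome_length - sequence_length + 1, 0)
    return windows * single_match_probability(sequence_length, p_base)


if __name__ == "__main__":
    # One plucked string of 19: 1 in 4**19 = 1 in 274877906944.
    print(single_match_probability(19))
    # Hypothetical genome length, assumed purely for illustration.
    print(expected_matches(30_000, 19))
```

The first figure is the "any single set of 19" chance; the second grows in proportion to the number of places in the genome where a match could start.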