Can someone tell me if this is mathematically possible?
Do you think there's value in that? If so, I may do it if I have some time.
It wouldn't be too difficult to do it empirically. Build a database of all Q drops & Trump tweets. Randomly sample pairs of the two, and label each pair as related in content or not. Sample enough pairs to reach your desired level of accuracy.
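To put rough numbers on the sampling step, here's a minimal sketch in Python. It assumes the drops and tweets are already loaded as plain lists of strings; the function names `required_sample_size` and `sample_pairs` are made up for illustration, and the sample-size formula is the standard normal approximation for estimating a proportion, not anything specific to this data.

```python
import math
import random

def required_sample_size(margin, z=1.96, p=0.5):
    """Pairs needed to estimate a proportion to within +/- margin.

    Uses the normal approximation n = z^2 * p * (1 - p) / margin^2.
    p=0.5 is the worst case; z=1.96 corresponds to ~95% confidence.
    """
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

def sample_pairs(drops, tweets, n, seed=0):
    """Draw n (drop, tweet) pairs uniformly at random, with replacement."""
    rng = random.Random(seed)
    return [(rng.choice(drops), rng.choice(tweets)) for _ in range(n)]

# Hypothetical placeholder data, just to show the call shape.
drops = ["drop text 1", "drop text 2"]
tweets = ["tweet text 1", "tweet text 2", "tweet text 3"]
n = required_sample_size(0.05)  # within 5 percentage points at ~95% confidence
pairs = sample_pairs(drops, tweets, n)
```

The fraction of these random pairs labeled "related" gives the baseline rate you'd expect from chance alone, which is what any claimed match rate would have to be compared against.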
The second step would be the most time-consuming. Manual human labeling would be the most accurate, but it could be prohibitively slow if you're sampling tens of thousands of pairs.
But an automated 'relatedness' score could be assigned based on edit distance, weighted by how commonly combinations of words occur together. Also, a word-meaning space could be trained a la Word2Vec, which could pick up on similar meanings in words even when the words aren't exactly the same.
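As a rough idea of what an automated score could look like, here's a sketch using only the standard library. To be clear, this swaps in a simpler technique than the ones named above: it's cosine similarity over word counts with TF-IDF-style weights, so shared rare words count for more than shared common ones. It is not edit distance and not Word2Vec, and the names `tokenize`, `idf_weights`, and `relatedness` are just placeholders.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def idf_weights(corpus):
    """Smoothed inverse document frequency: rarer words get larger weights."""
    n = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}

def relatedness(a, b, idf):
    """Cosine similarity between IDF-weighted word-count vectors, in [0, 1]."""
    # Assumption: words never seen in the corpus are treated as maximally rare.
    default = max(idf.values(), default=1.0)
    wa = {w: c * idf.get(w, default) for w, c in Counter(tokenize(a)).items()}
    wb = {w: c * idf.get(w, default) for w, c in Counter(tokenize(b)).items()}
    dot = sum(wa[w] * wb[w] for w in wa.keys() & wb.keys())
    norm_a = math.sqrt(sum(v * v for v in wa.values()))
    norm_b = math.sqrt(sum(v * v for v in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical toy corpus; real weights would come from all drops and tweets.
corpus = ["the storm is coming", "the wall is tall", "the dog is here"]
idf = idf_weights(corpus)
score = relatedness("the storm", "storm ahead", idf)
```

You'd still want to hand-label a few hundred pairs to pick a cutoff for "related" and check that the score agrees with human judgment before trusting it on tens of thousands of pairs.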
(I'm finishing up a Ph.D. in computer science and I'd much rather work on this than my own dissertation for a few days, lmao.)
But my gut feeling is that none of this is necessary. Sticking around here long enough is enough to assure anyone, even without pinning it down to an exact number, that the odds of it being a coincidence are astronomically low. And I say that as someone who's only really dug into Q for about a week, lmao.
Okay doomer
Where was there any semblance of dooming in my comment that you responded to?
"that the odds of it being a coincidence are astronomically low."