Do you think there's value in that? If so, I may do it if I have some time.

It wouldn't be too difficult to do it empirically. Build a database of all Q drops & Trump tweets. Randomly sample pairs of the two, and label each pair as related in content or not. Sample enough pairs to reach your desired level of accuracy.
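
Just to make the sampling step concrete, here's a rough sketch of what I mean (my own illustrative Python; `q_drops` and `tweets` are placeholders for whatever database you end up building):

```python
import math
import random

def sample_size(margin_of_error: float, confidence_z: float = 1.96, p: float = 0.5) -> int:
    """Worst-case sample size for estimating a proportion to +/- margin_of_error
    at ~95% confidence (z = 1.96)."""
    return math.ceil(confidence_z ** 2 * p * (1 - p) / margin_of_error ** 2)

def sample_pairs(q_drops, tweets, n, seed=0):
    """Draw n random (drop, tweet) pairs to hand off for related / not-related labeling."""
    rng = random.Random(seed)
    return [(rng.choice(q_drops), rng.choice(tweets)) for _ in range(n)]

# e.g. about 1,068 labeled pairs gives a ~3% margin of error at 95% confidence
n = sample_size(margin_of_error=0.03)
```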

The second step would be the most time-consuming. Manual human labeling would be the most accurate, but it could be time-prohibitive if you're sampling tens of thousands of pairs.

But an automated 'relatedness' score could be assigned based on edit distance, weighted by how commonly combinations of words occur together. Also, a word-meaning space could be trained a la Word2Vec, which could pick up on similar meanings even when the words aren't exactly the same.
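
A rough sketch of that scoring idea (illustrative only; it swaps exact edit distance for difflib's ratio and approximates the word-combination weighting with inverse document frequency, so shared rare words count more than shared filler words):

```python
import math
from collections import Counter
from difflib import SequenceMatcher

def relatedness(a: str, b: str, doc_freq: Counter, n_docs: int) -> float:
    """Score in [0, 1]. doc_freq counts how many documents each word appears in."""
    # Character-level similarity as a normalized edit-distance stand-in.
    char_sim = SequenceMatcher(None, a.lower(), b.lower()).ratio()

    # Word overlap weighted by inverse document frequency: shared rare words
    # contribute more than shared common words like "the".
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return char_sim

    def idf(w: str) -> float:
        return math.log((n_docs + 1) / (doc_freq.get(w, 0) + 1)) + 1.0

    overlap = sum(idf(w) for w in words_a & words_b) / sum(idf(w) for w in words_a | words_b)

    # Even split between the two signals; the weights would need tuning.
    return 0.5 * char_sim + 0.5 * overlap
```

For the word-meaning space, something like gensim's Word2Vec trained on the combined corpus would do the trick (again, the variable names are placeholders):

```python
from gensim.models import Word2Vec

# corpus_tokens: every drop and tweet tokenized into lowercase words.
corpus_tokens = [doc.lower().split() for doc in q_drops + tweets]
model = Word2Vec(sentences=corpus_tokens, vector_size=100, window=5, min_count=2, workers=4)

def embedding_similarity(a: str, b: str) -> float:
    """Cosine similarity between the mean word vectors of the two texts."""
    ta = [w for w in a.lower().split() if w in model.wv]
    tb = [w for w in b.lower().split() if w in model.wv]
    if not ta or not tb:
        return 0.0
    return float(model.wv.n_similarity(ta, tb))
```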

(I'm finishing up a Ph.D. in computer science and I'd much rather work on this than my own dissertation for a few days, lmao.)

But my gut feeling is that none of this is necessary. Sticking around here long enough will convince anyone, even without pinning it down to an exact number, that the odds of it being a coincidence are astronomically low. And I say that as someone who's only really dug into Q for about a week, lmao.
