PNC is right that the dude on Twitter sort of exaggerates the sample size issue by talking about individual years, but I think the guy’s point is that year by year you can’t really hope this is a representative sample of cops with such low numbers. But he didn’t make that clear, so PNC’s question is fair.

Still highly underpowered (I certainly wouldn't have written a paper based on it myself), but I don't think the issue is as bad as the guy on Twitter is making it seem. Am I missing something here?
Yes. Cops are not randomly sampled.
Are the same cops sampled repeatedly over time? If so, wouldn't pooling repeated observations as if they were independent be even more of a distortion?

You don't know much about the GSS, do you?
I actually did use it in one paper, but it had nothing to do with cops, so I don't understand what people are saying here about the cop sample.
People here seem to be asserting that the cops are purposively selected in an oversample. Are the same individual cops included in multiple surveys for some reason?

The small number of cops is not ideal, but it is also not as big of a deal as the Twitter person says it is. As others have said, the assertion that the authors "lied" is way too much. The main effect stuff is unreliable, but it isn't exactly extreme in terms of sloppiness. The interactions are a big problem, however.

At what point would an occupation-based subsample have enough power to be statistically representative? This is a subsample of 80. Would 100 be enough? 200? Why?
The GSS is a stratified random sample of the whole US, so it is only a probability sample of the US population as a whole. But scholars regularly look at GSS subsamples for various reasons and then generalize to the whole subpopulation, even though the survey wasn't designed as a random sample of that subpopulation. For example, they might look at the opinions of women or of some ethnic group.
What if 40% of the sample worked in the service industry? Would that 800 N in one year be enough to make generalizations about the opinions of someone working in the service industry that year? If not, what would the threshold be, and why?
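To put rough numbers on the threshold question, here is a minimal sketch of the usual 95% margin of error for a proportion. It assumes simple random sampling and a true proportion of 0.5 (the worst case), neither of which comes from the thread; the GSS design is more complex, so real intervals would likely be wider. The function name `margin_of_error` is just for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% half-width for a proportion estimated from
    a simple random sample of size n (assumption: no design effect)."""
    return z * math.sqrt(p * (1 - p) / n)

# Subsample sizes mentioned in the thread: 80, 100, 200, 800
for n in (80, 100, 200, 800):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} percentage points")
# n=80:  +/- 11.0 percentage points
# n=100: +/- 9.8 percentage points
# n=200: +/- 6.9 percentage points
# n=800: +/- 3.5 percentage points
```

Under these assumptions precision improves smoothly with N, so the formula gives no sharp cutoff between "enough" and "not enough". It also speaks only to sampling error within a single year, not to whether the subsample is representative.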

Yes. Cops are not randomly sampled.
That's the main point: the results are not just underpowered, they are also wrong and not representative.
I am not as familiar with the GSS as many. But how are the cops sampled differently? It seems to me that they are sampled the same as the rest of the GSS (hence why there are so few of them). To me this is different from the initial concern, which is that with such a small N it is unlikely you have a random sample of cops. The distinction is between cops not being randomly sampled and not having a random sample of cops.

The number of cops varies a lot from year to year. In some years, it is a single cop. An overall t-test of the difference between cops and non-cops, combining all years, is meaningless if the two groups disproportionately come from different time periods. The continuous year specification in no way deals with this.
Also, tons of interactions are bad, interactions in logistic regression are tough, etc. And omitting the coefficients for nonsignificant variables is fishy.
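The pooling concern (cops and non-cops coming disproportionately from different years) can be illustrated with made-up numbers. Everything in this sketch is hypothetical: the counts are invented, not taken from the GSS or the paper, and the two "periods" stand in for survey years. Within each period cops and non-cops agree at exactly the same rate, yet the pooled comparison shows a gap simply because most cops come from the later period.

```python
# Hypothetical counts: (period, group) -> (respondents, number agreeing)
counts = {
    ("early", "cop"): (10, 3),          # 30% agree
    ("early", "non-cop"): (1000, 300),  # 30% agree
    ("late", "cop"): (70, 42),          # 60% agree
    ("late", "non-cop"): (1000, 600),   # 60% agree
}

def rate(group, period=None):
    """Share agreeing in a group, within one period or pooled over all."""
    cells = [v for (p, g), v in counts.items()
             if g == group and (period is None or p == period)]
    n = sum(c[0] for c in cells)
    k = sum(c[1] for c in cells)
    return k / n

pooled_gap = rate("cop") - rate("non-cop")
within_gaps = {p: rate("cop", p) - rate("non-cop", p)
               for p in ("early", "late")}

print(f"pooled gap: {pooled_gap:.4f}")  # 0.5625 - 0.45 = 0.1125
print(f"within-period gaps: {within_gaps}")  # 0.0 in both periods
```

This only shows how a pooled comparison can mislead when group composition shifts over time. It does not show what a continuous year covariate would or would not fix in the actual models.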

I am not as familiar with the GSS as many. But how are the cops sampled differently? It seems to me that they are sampled the same as the rest of the GSS (hence why there are so few of them). To me this is different from the initial concern, which is that with such a small N it is unlikely you have a random sample of cops. The distinction is between cops not being randomly sampled and not having a random sample of cops.
GSS samples the US population, not the population of cops. If there were more cops in the sample, at some point this concern washes out. But with such tiny Ns, it’s a real concern.