Side note, but I'd love to read a technical overview of how redacting specific types of information from large data troves works.
Most of what I do is just Adobe Acrobat redaction tool + find/replace + manual review. The redaction guidelines I have to follow for discovery are much more extensive than just names, so this necessarily has to be done manually, page by page. In addition to the wide variety of information that requires redaction, the discovery we get in criminal cases tends to be very haphazardly organized. I'll get a giant PDF file with 1000+ pages and there are random police reports, handwritten autopsy reports, photos, pictures of thumb drives with additional discovery, etc. You generally should expect nothing to be OCRed. All of those are annoying hurdles, but we can get funding for paralegals to help.
Redaction involving large data troves relies on proprietary enterprise-level software specifically built for discovery review. Generally the relevant filter for reviewing discovery is whether or not it's responsive or whether it falls under any privileges (such as trade secret or attorney-client) and this requires actual attorneys to do document review: seriously the worst job ever. This usually is farmed out to contractors who hire painfully bored attorneys to do the grunt work.
Generally it'll be organized in multiple filters where the lowest tier just clicks responsive/non-responsive, and then the next tier clicks between privileged/non-privileged. Anything more nuanced than that will get specialized attention from better-paid attorneys.
I'm not familiar with software that is specifically built for _just_ redactions, but I'm assuming the government agencies I deal with have something like that in place. They tend to respond fairly quickly to FOIA requests (a few weeks to a few months) even when there are a lot of redactions to impose.
From an IT guy's point of view: the fact that e-mail is a single format doesn't mean that it is easier to redact. In a sense, "books" are a single format too, but they can be written in all sorts of styles and languages and contain any data. Same with e-mails: people use them creatively.
40 GB is a lot of data from a human point of view. We are used to huge disks nowadays, but in pure text, 40 GB would be something like 14 000 typical books. Of course, those e-mails aren't pure text, but it is still a lot of information-rich content.
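For anyone who wants to sanity-check the book comparison, here is a rough back-of-the-envelope sketch. The helper names and the 1 MB-per-book figure are assumptions for illustration only; how much plain text a "typical" book holds is a judgment call.

```python
GB = 10**9  # decimal gigabyte, in bytes
MB = 10**6  # decimal megabyte, in bytes

def implied_mb_per_book(total_gb: float, books: int) -> float:
    """Average plain-text size per book implied by a total size and a book count."""
    return total_gb * GB / books / MB

def books_in(total_gb: float, mb_per_book: float) -> int:
    """How many books of an assumed size fit into the given amount of text."""
    return int(total_gb * GB // (mb_per_book * MB))

# The figure quoted above: 40 GB as roughly 14 000 books.
print(round(implied_mb_per_book(40, 14_000), 2))

# Hypothetical alternative: assume 1 MB of plain text per book.
print(books_in(40, 1.0))
```

The quoted figure works out to about 2.86 MB of text per book. With a smaller assumed book size the count only goes up, so the point stands either way: it is far more than anyone can read through by hand.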
In this context, it depends a lot on what you mean by "anonymization". Does it mean just stripping e-mail addresses and proper names away? That can be done relatively quickly, but people leave other traces of their identity in texts, which are much harder to detect automatically. If someone mentions that he likes to go to a certain restaurant for its excellent Merlot, it is already quite a good identification of that person.
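To make the "easy part vs. hard part" distinction concrete, here is a minimal sketch of the easy part, assuming you already have a list of names to strip. The regex, the `redact` helper, and the sample sentence are all hypothetical, and the e-mail pattern is a simplification that will miss unusual addresses.

```python
import re

# Simplified e-mail pattern; real-world addresses can be messier than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def redact(text: str, names=()) -> str:
    """Replace e-mail addresses and any listed proper names with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    for name in names:
        # Whole-word, case-insensitive match on each known name.
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, "[NAME]", text, flags=re.IGNORECASE)
    return text

sample = "Jane Doe (jane.doe@example.com) loves the Merlot at Luigi's."
print(redact(sample, names=["Jane Doe"]))
```

The name and the address are gone, but the restaurant and the Merlot are still sitting there, which is exactly the kind of trace a pattern match cannot know to remove.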
At the end of the day, it all depends on what level of anonymity you require. If it is only expected that a bored court clerk looks over the records once, maybe it is enough to redact e-mail addresses and proper names. But if the entire package of data can be expected to be leaked online with a command "sleuths of the Revolution, identify all the fascists to get them canceled", then 150 thousand dollars may not even buy you the required level of anonymity.
Yes, those are all fair and important concerns. When I said "single format" I was trying to think of the best way to distinguish from the mixed-media mess I typically get in discovery. I don't have enough information to conclusively claim that $150k is a fraudulent estimate.
On meme phrases, you should check out Marc Andreessen’s recent rant on debanking to Rogan
Perfectly on point
“The government is debanking people being anti immigration”
Becomes
“Existing legislation enacted in 2009 requires volatile assets to be limited to 15% of bank receipts and therefore banks have declined to deposit the untaxed earnings of known crypto scammers”
Haha
You're either lying or you're ignorant. Nick Fuentes was debanked and it had nothing to do with crypto scams.
Was this in early 2021? I wonder what else he was up to around that time…
Even if you think he committed a crime, he was never convicted of one, and that's not what you said in your comment. You implied that only crypto scammers were debanked, not political dissidents.
To be clear, the original claim was that the CFPB is influencing banks to systemically debank right wingers.
I don’t really agree with JP Morgan debanking individual figures after a scandal but that was all very public and quite different.
Which other people have been debanked for “being anti-immigration”, as Marc said?
Oh man, that's perfectly on point regarding the obfuscation I'm referring to. Also, Rogan's credulity on display.
Did you leave Twitter btw?
Yes, deleted my account. It was fun back in the day to chat and socialize with cool people, but over time I'd just see more and more verified users with very poor quality replies and engagement. Way too much unimaginative "ooga booga racist slop". It wasn't a problem because I got offended (lol no) but constantly getting called sand nigger is very boring, they could've at least been creative about it.
What an irony. I left Twitter in June 2021, way, way before it was cool to do so...
It was a hysterical cesspool already.
Yeah I’ve been liking Substack more and more lately. Twitter is literally a platform designed for meme talking points
> "... how Meghan Murphy hides her aesthetic disgust of the sex industry behind a perpetual game of whack-a-mole ..."
LOL. 👍😉🙂
Ironic that you accuse me of "intellectual insecurity" when this piece seems to be mostly a cope to cover your embarrassment about how the New York government's prosecution of VDARE makes liberals look bad.
Here is my full response: https://substack.com/@simonlaird/note/c-81147254
Sadly, I don't think this works if considered in detail: "Our ancestors have squared off against much more dire tribulations ...". Semi-hypothetically, I'd assert it's quite reasonable for a person to say they don't want to go through a burdensome legal process, or piles of social media abuse, even if elsewhere brave freedom-fighters have endured torture and death for the cause of liberty. There's a bad moralism which tends to end up lecturing the abused about how they aren't being noble enough when facing abusers. I was driven to abandon technology activism myself due to lawsuit fears and smear campaigns. One of many frustrating aspects was dealing with people preaching at me to be infinitely self-sacrificing (and woe if, perhaps exhausted and stressed-out, I didn't respond in accommodating good humor).
Now, I don't want to take a position on VDARE in specific, since I don't know any details about the claims. But I believe this article mixes up too many things. There's the issue of inflammatory rhetorical characterizations. However, that's different from whether the costs of free speech can be made very high. And further whether a particular incident counts as an example. Plus many speculations as to mental motivation. I don't think it works well to mash these all together so much.
> I'd assert it's quite reasonable for a person to say they don't want to go through a burdensome legal process, or piles of social media abuse, even if elsewhere brave freedom-fighters have endured torture and death for the cause of liberty.
You misunderstand me, perhaps my fault for articulating myself badly. I completely agree those are all reasonable positions and I have never faulted anyone for wanting to avoid abuse. If it's too dangerous or burdensome or even inconvenient to speak one's mind, staying silent is a defensible position. What I do not and cannot tolerate, however, are self-styled intellectuals who either lie or obfuscate their positions. It's best for someone to stay silent rather than poison the discourse with dishonesty.
Also, my post wasn't intended to be about VDARE or the merits of its case specifically, but rather about the obfuscation with which that case was being talked about. VDARE tends to be fairly transparent about their views AFAIK so it wouldn't make sense to accuse them of cowardice.
> Plus many speculations as to mental motivation.
I'm reaching conclusions based on observable facts. Anyone is welcome to point out any flaws I made in my journey. Simon in particular is _especially_ welcome to correct any mistakes I made.
Repent
Simon Laird is an intellectually-dishonest moron. Easily my least favorite 'writer' that pops up on my feed all the time. He sucks.
My comment is not productive, I know, but for real, Laird is a self-serious and dishonest loser.
He's certainly putting in a lot of effort in solidifying that impression
> "Simon Laird, a profile in courage".
🙂 That bit of his about "your bitter, seething tribal hatred" probably qualifies as his "precious bodily fluids" moment, his own unmasking as something of a fraud and charlatan at best.
But quite interesting and thorough essay. Somewhat offhand, or maybe speaking to your various points, it reminds me of Trivers' "The Folly of Fools: The Logic of Deceit and Self-Deception in Human Life":
https://en.wikipedia.org/wiki/The_Folly_of_Fools
Not that I've read it myself. But this seems to capture much of what you're saying:
Wikipedia: "While the evolutionary benefits to deceiving other organisms are obvious at first glance it seems highly counter-intuitive to think that it could ever be in the evolutionary interest of an organism to deceive itself."
When you can fake sincerity then you have it made. But kind of think that that is only part of the picture, the other part being various "unexamined assumptions" -- seems we all have them, are all kind of obliged to proceed as if they were true. Which can be rather problematic indeed:
“It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so.” ― Mark Twain
https://www.goodreads.com/quotes/7588008-it-ain-t-what-you-don-t-know-that-gets-you-into
Though often we simply don't want to examine those assumptions -- and for less than edifying reasons:
" 'Conscience doth make cowards of us all' means that our knowledge of the sins we have committed that could send us to hell causes us to fear death."
https://www.quora.com/What-is-the-meaning-of-the-saying-conscience-makes-cowards-of-us-all-If-it-is-true-then-why