Full text search on 400M US court cases (judyrecords.com)
468 points by richardbarosky on Nov 19, 2020 | 163 comments



It's pretty hilarious and somewhat frightening I found my dad's arrest 25 years ago for a speeding ticket he had "forgotten" to pay. I remember being 11 years old and having to wait 8 hours for my parents to come back from picking up a pizza. Data availability is crazy.


the frightening part is although your father's record is available to the public, police officers who are caught lying while testifying get to seal the record.

in many jurisdictions, sealing a record is the equivalent of destroying it.

your crimes will haunt you forever because the system never forgets, meanwhile they simply go back to business like it never happened

ref https://www.google.com/amp/s/www.nytimes.com/2018/03/18/nyre...


I think judges can seal records for anyone. A friend of mine had it done after his conviction several years ago. Sure enough I can't find him in the database. He still notifies potential employers about it though.


He’s legally required to.


Found an assault charge on my Mom from '92.


Do you think the data should be removed from the government portals? Those are interesting points. What do you think is the right balance to strike?

I can see why it might be surprising to find some results when searching. The same data has already been available in many other databases that have existed long before this one and in those described on the info page as well.


It's on the internet forever now. If there's a balance to strike it would have had to have been done in 2007 when the court digitized their records and put them online.

A search for "minor consuming" reveals a few hundred thousand cases against children. I'm a little surprised to see that.


Minor in this context is often 18 to 20 years old.


And drinking at that age is legal in many places.

I’d imagine there are a fair few people on that list whose crimes are for things that are now legal.


Arrest for a speeding ticket? Good God.


The arrest was for failing to appear in court.


In other countries, if you are speeding, you get a ticket by mail. If you are driving under the influence, you are sent to jail for the night.

Why do taxpayers have to pay for expensive court proceedings, and why do offenders have to spend a lot of money on an attorney and waste a bunch of time?


>Why do taxpayers have to pay for expensive court proceedings, and why do offenders have to spend a lot of money on an attorney and waste a bunch of time?

They don't have to. The usual process for something like a speeding violation in the US is:

1. You get stopped for speeding (or get caught on a speedcam).

2. You receive the ticket in the mail.

3. At this point, you have an option to agree with it and pay the fine OR appear in court and hope they will rule in your favor (which could easily happen if you genuinely believe they were wrong; and half the time, the cop himself will fail to appear in court anyway, so you get the ticket dismissed if it wasn't anything too wild).

You don't have to appear in court (if you choose to accept the ticket and pay the fine) or waste money on attorneys (if you choose to contest the ticket in court). You can literally play it the same way in the US as you just described, by getting the ticket in your mailbox and paying it off (for something like speeding). That's it. Contesting the ticket in court is just another option available to you.

As for the parent commenter's case of failing to appear in court: they basically didn't pay the ticket they received (i.e. ignored it) and didn't show up in court to contest it either. That's pretty much it.


Ah yeah suppose that can escalate.


Well I don’t even know what’s going on here. https://www.judyrecords.com/record/hxv7x79e609


"In Nix v. Hedden, 149 U.S. 304, 37 L. Ed. 745, 13 S. Ct. 881, the question presented was whether tomatoes were to be classed as fruit or vegetables under the tariff act. The court found no particular help from the witnesses called, and decided the point through the use of a dictionary, which was not evidence, but an aid to memory and understanding."

https://www.judyrecords.com/record/5sfv13kuy1081


Ahh yes, the John J. Bitch from Slutville who lives on Whore Street. I know him well. But I don't remember him having pink eyes or being 335 pounds.


apparently the search box repopulates from a cookie. Last thing I searched was my name. I clicked your link and freaked out for a second.


I'm going to go out on a limb here and say this is probably a test record for whatever software they use.


Compelled prostitution of a person under 17, duh. I mean, obviously. By a person named "bitch" in "whoresville, usa". It was all authorized by "PIMPDADDY", as shown there, plain as day.


"John Julio Bitch" there is quite the baller. Released the very same day for a $25 million cash bond. He's also apparently an albino who stands 5'1" tall and weighs 335 pounds.


That is amazing.

There are lots that are amusing, with most names or insults getting a hit. It seems the ‘AKA’ field has all sorts of entries.


I feel this is bad news. Some things should be forgotten. In my country your record gets soft wiped after 8 years. With this the employer could just look up your name.


I found my conviction for assault from a bar fight 23 years ago. Since then I quit drinking, went to college, raised a child who is now 20, and turned everything around. It's pretty disheartening to see it can be found by anyone 23 years later. Unfortunately, I'm not surprised in the least that we allow this in the US.


Have you tried to get your record purged?


I'm looking into that right now again. It's a pretty tedious process where the Governor has to personally approve the request. Right now there is a Republican governor in that state so they're less likely to approve it. Since background checks can only go back 7 years, 10 in some states, I let it go and decided it wasn't worth it considering that. I thought it was behind me, but this definitely changes things. Thanks for mentioning it.


The general idea has various problems. For example, would newspapers or accounts of things/people, in various media, of objectively public information be required to be retroactively removed from any mention? Does it make sense to force and dictate what entities/individuals can do with basic information at the discretion of anyone who doesn't like it? Just a few thoughts. The records exist in the database because they are public information. If a record is removed from public view, that's done when requested because it's the right thing to do, although there is no strict legal obligation to do so.


How does Europe’s “right to be forgotten” handle it?


Not sure. Maybe if someone doesn't like the Google search results, they make a complaint, and Google has to do what they want.

There are many other public records databases that have similar data, including the federal judiciary and many state courts across the country.

Some are listed on the info page: https://www.judyrecords.com/info


From the Reddit link [1]

Sorry, I'm just curious.

It says MySQL 8 and Elasticsearch 7.8. I don't have much experience with Elasticsearch, so I wanted to know: how does Elasticsearch make it faster? Is it like an extension that makes it faster? Or does Elasticsearch have its own data store that consumes data from the database and magically makes it faster?

Thanks.

[1]https://www.reddit.com/r/programming/comments/jg4rkv/how_a_s...


Elasticsearch (Lucene under the hood) implements an inverted index, which is an extremely fast data structure for text search. ES also has clustering as a primary feature, plus many search features that can significantly improve relevance and that you won't find in MySQL or most other databases.
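To make the inverted-index idea concrete, here is a minimal Python sketch. It is a toy illustration only, not how Lucene or Elasticsearch is actually implemented; the sample records and the names `tokenize`, `build_index`, and `search` are made up for this example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; real analyzers do far more."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample records, not taken from the real database.
docs = {
    1: "speeding ticket failure to appear",
    2: "assault charge dismissed",
    3: "speeding ticket paid",
}
index = build_index(docs)
print(sorted(search(index, "speeding ticket")))  # [1, 3]
```

The point is that a query never scans the documents themselves: it looks up each term's posting set and intersects them, which is why lookups stay fast as the collection grows.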


Have you tried Toshi [1] or MeiliSearch [2]? I wonder how they would compare in terms of operational costs (monthly cloud hosting bill) at the current data size.

[1]: https://news.ycombinator.com/item?id=18895655

[2]: https://news.ycombinator.com/item?id=22685831


Do you have the structured dataset somewhere? I’d love to index it in Typesense [1] and see how it does.

I recently tried a 32M songs dataset [2] and it works great, so I’m on the lookout for larger datasets to benchmark with.

[1] https://news.ycombinator.com/item?id=22181437

[2] https://songs-search.typesense.org/


Plus it does not support joins, so you basically have to denormalize all your data before injecting it into Elastic. That helps speed things up, but it is a headache to manage on a day-to-day basis.


Yeah. What I do is create a view that does all the joins then the middleware just needs to do "SELECT * FROM my_view". If the DB has good JSON support, I will also convert the data into an ES index request with SQL so the middleware becomes even simpler.
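The view approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the commenter's actual setup: SQLite stands in for the real database, the `cases` and `charges` tables and their contents are hypothetical, and the conversion to Elasticsearch bulk-request lines is done in Python rather than in SQL.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized tables.
    CREATE TABLE cases (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE charges (case_id INTEGER, description TEXT);
    INSERT INTO cases VALUES (1, 'State v. Doe'), (2, 'State v. Roe');
    INSERT INTO charges VALUES
        (1, 'speeding'), (1, 'failure to appear'), (2, 'assault');

    -- The view does the join once, so the middleware never has to.
    CREATE VIEW my_view AS
    SELECT c.id AS id, c.title AS title, ch.description AS charge
    FROM cases c JOIN charges ch ON ch.case_id = c.id;
""")


def bulk_lines(conn, index_name="cases"):
    """Turn view rows into newline-delimited JSON for a bulk index request."""
    docs = {}
    rows = conn.execute("SELECT id, title, charge FROM my_view ORDER BY id, charge")
    for case_id, title, charge in rows:
        # Fold the joined rows back into one denormalized document per case.
        doc = docs.setdefault(case_id, {"title": title, "charges": []})
        doc["charges"].append(charge)
    lines = []
    for case_id, doc in docs.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(case_id)}}))
        lines.append(json.dumps(doc))
    return lines


print("\n".join(bulk_lines(conn)))
```

Each case becomes one action line plus one document line, which is the shape the Elasticsearch bulk endpoint expects; the index name `cases` is an assumption.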


Let’s say you have a mostly read-only DB (otherwise things are different).

Does it work when the view is insanely big? [i am not an expert at DBs, so my vision of a big DB might amuse you, but let’s say I have millions of rows to assemble as a view].


How would you say it stands up to splunk these days?


Elasticsearch is a search platform. A "database" but meant for search stuff. It's not part of MySQL.



The gist is Elasticsearch is a full-index database. Whatever data goes in gets indexed as compared to only indexing certain fields in MySQL on which you perform search frequently. Think of Elasticsearch as MongoDB + full-indexing. It's a document storage with blazing fast search and aggregation.


https://www.courtlistener.com has more useful features and is part of the Free Law Project.


I've noted CourtListener on the info page: https://www.judyrecords.com/info

"PACER notwithstanding, CourtListener is the most powerful case law research tool available online — and in many ways is much more powerful."

This is based on CourtListener's 4 million+ written court opinions, which judyrecords has recently integrated. But you're right, CourtListener has more case law research features.


I just managed to find the home address of a YouTuber I'm a fan of in 15 seconds. Creepy site. Glad I'm not in the US


Interesting point. If you know the state where someone lives, you can look up the same info on the government website. Additionally, many other databases have the same public data but ask for a payment to search.


Whoa. Searched my name and found my sealed court records from when I was barely a teen (25 years ago). The records even state "sealed/exempt from public" at the top.

This is pretty neat as I have never seen the records, even though I requested them (out of curiosity) 10 years ago, only to be told they had been destroyed years prior.


Is the case available on the court portal? You can email me at richardbarosky@gmail.com if it's been removed since it was retrieved.


Can you explain why the poster would want to email you?

Edit: I think I got there eventually - this is something you made?


Does anyone know if the race data is used for crime statistics? I did a quick sample of people I know, and almost every South Asian was miscategorized as Black or White.


The race data in court case records is very often used for crime statistics. It's probably the most analyzed data point after what the incident was about.


No, it’s likely not. Arrest records are separate from traffic citations, and they are two different databases. Also, your race may come from the cop filling out the ticket, or it may come from your license in more advanced jurisdictions. The source for crime statistics is usually not court records; those are held by the courts.


Wish we would have known about this years ago. One search would have prevented the hire of someone that ended up costing us a ton of money. Most background searches don’t get local or state court cases like this without major expense that small businesses can’t afford.


Many similar databases and people finder sites are behind a paywall. There are a lot of positives to being able to use public records data available to make more informed decisions, whether it's to let your kid stay at someone's house you don't know or whatever it might be. Thanks.


Interested in how large this dataset is?

Is it in a format that could be backed up by a community to protect? Seems like something folks in /r/datahoarder would be interested in backing up.


15KB is maybe the average case size, including HTTP request data.

That's 1024 * 15 * 439,000,000 = 6.7TB roughly.

The cases are all compressed, so I'm not using 6.7TB non-compressed for cases. But there are other request and non-request related records needed too. Just my backups currently.
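The back-of-the-envelope estimate above can be checked with a few lines of Python. The 15 KB average and the 439,000,000 case count are the figures quoted in the comment; the variable names are just for illustration.

```python
AVG_CASE_BYTES = 15 * 1024   # 15 KB average per case, as quoted above
NUM_CASES = 439_000_000      # case count used in the estimate above

total_bytes = AVG_CASE_BYTES * NUM_CASES
print(total_bytes)                     # 6743040000000
print(round(total_bytes / 10**12, 2))  # 6.74 (decimal terabytes)
print(round(total_bytes / 2**40, 2))   # 6.13 (binary tebibytes)
```

So "6.7TB roughly" holds in decimal terabytes; measured in binary units the same byte count is closer to 6.1 TiB.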


Being as you're offering use of the site for free, would you be open to the idea of also offering publicly available DB dumps? There's plenty of fun projects that I can imagine doing if I had that data locally.


One Reddit user estimated the monthly cost of this site at over $2000 USD. How are you funding that?

https://www.reddit.com/r/programming/comments/jg4rkv/comment...


I've downgraded from that. I talked about that in that post. It was most definitely a knee-jerk reaction to getting slashdotted on a popular subreddit and not wanting that to happen again. However, I'm still on some very good hardware and handling the current workload pretty well right now. That estimate was high.


Bullet points on what you downgraded to cut costs? Curious technical minds want to know.


Sure, I'll post after the dust settles. Server getting smashed but still handling searches pretty dang well.

Some sites crash from the page views, and here I have to handle everyone searching 400 million documents too.


Odds are this won’t help you, but just in case you haven’t seen it.

https://blog.burntsushi.net/transducers/



333 N Warcraft Lane Undercity, Washington 99999

Looks like a place I would like to live in.


Ah, good one. There are many nuggets in there. "holy shit", fart, etc.


Any idea how those came into the system?

The one I quoted seems to be some kind of test case?


Most likely, just like the asdf occurrences.


I'm not sure why I didn't expect them to be in this database, but this also has like traffic tickets and similar.


I understand the open court argument, we need to see what goes on so nothing funny happens there. But unless we're talking about a major crime, what good does it do to list and index on Google everything from 30 years ago?

I am no fan of this at all.


If our society decides it is necessary to act with the full weight of the law behind it, then it would seem better to have the information available for the public to verify than not. I'm not saying it is all great, but that it is far better to have information available so that things like average sentence length for a given crime based on demographic and psychographic information can be queried by all. If a city that is 50/50 male/female and 20/80 black/non-black finds their speeding tickets are 70/30 male/female and 35/65 black/non-black, then it may be worth investigating to see if police are being fair about who they give warnings to, who gets reduced tickets, and who gets neither.

As for major privacy concerns, it is generally the more major crimes that have the larger issue with the victim being known. Knowing that some one was the victim of mischief vandalism is far less a privacy invasion than knowing they were the victim of sexual assault of a child (and even hiding the victim's identity often doesn't do more than hide the name from a passive search).

Then there are the benefits that other posters have raised, such as being useful for knowing past decisions used even in minor trials.
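The hypothetical city above can be turned into a quick per-capita comparison. This is only a sketch using the made-up percentages from the comment; the function name `rate_ratio` is an assumption, and a ratio like this flags something worth investigating rather than proving unfair treatment.

```python
def rate_ratio(population_share, ticket_share):
    """Per-capita ticket rate of a group divided by that of everyone else."""
    group_rate = ticket_share / population_share
    rest_rate = (1 - ticket_share) / (1 - population_share)
    return group_rate / rest_rate


# Figures from the hypothetical city in the comment above.
print(round(rate_ratio(0.50, 0.70), 2))  # 2.33 (male vs. female)
print(round(rate_ratio(0.20, 0.35), 2))  # 2.15 (black vs. non-black)
```

A ratio of 1.0 would mean tickets are issued in proportion to population share.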


The general privacy issue that most jurisdictions have decided they just don't care that much about is that easy, indexed, free access to public records is different from the case where that same information is in a dusty file cabinet somewhere. There are a lot of things that people are, in principle, OK with being a matter of public record but are maybe less OK with their neighbor being able to casually discover it through Google.


Totally agree. I'd be all for open court records, requested in person, received in paper form against a small processing fee.

I do have a different cultural background so it's probably natural this feels horrible. Everything about this site would be so illegal in my home country it's almost hilarious in comparison. I'm used to (and fully approve of) a law that you can't keep a list of names in a notebook without a proper reason and everyone's consent; that would already be an illegal register.


So a Christmas card list would be illegal? That seems...excessive.


Good points.

If you look at the info page there is a specific example about how to look up codes of cases that had the same charge.

Being able to see how other offenders are sentenced is useful to make sure people are being treated fairly. Lawyers use this kind of data up to the point of producing analytics from data like that to understand outcomes. Major legal data companies have a large segment of business doing analytics for lawyers handling high and lower level cases.

Here are a few related links: https://cluesearch.org/ https://measuresforjustice.org/


Worse, there's no obvious business model or disclosed funding source or institutional affiliation here.

That leaves me with the distinct impression that they're monetizing data about visitors and searches in some horrible way. (Data targeting for mugshot shakedown operations?)

I'm not going near this.


Maybe ads at some point.


Sometimes even seemingly trivial cases can be caselaw precedent that people should be able to see and access without paying (they are public records).


Good point. PACER, in fact, has been called out by major news publications for literally being a scam in the way they charge for access to public records.

https://www.politico.com/magazine/story/2019/03/20/pacer-cou...


Only 3 pages are indexed on Google. Actually, most of the other legal databases (listed on info page) have their cases indexed on Google. However, judyrecords cases aren't indexed on Google. I understand your general sentiment.


How did you get google to index that many pages? For me Google only crawls about 1000 pages per day, no matter how many I show in the index


Results like "MEETING ID" and "PASSWORD" for zoom meetings show up way more than any other video conferencing tool for 2020 cases.


Many Zoom meetings are recurring and this might not be safe


Looked at one record as an example and sure enough, the same meeting ID and password is found in 709 different cases in Cleveland, OH.


On a whim, I decided to search for "quicksort", and found a judgment where a loan company was trying to sue for infringement on the grounds that a competitor copied the SQL schema of their product. The complaint was upheld.

https://www.judyrecords.com/record/3nqi41qycaf9


wow...

"The Court finds that New Century had access to the SQL Data [pg. 536] Structures and that there is enough probative similarity to find that New Century factually copied the SQL Data Structures."

The next question might be to have 'Positive Software' demonstrate that they did not, in fact, take their table schemas from some place else. Like... textbooks? Or... example database schemas from vendors. Or tutorial sites? Or competing products?

There may be something extremely unique about part of their structure, perhaps, but... at the same time, there's often very little variety in how most similar data (crm/sales/lead gen/etc) might be stored to be remotely usable for reporting anyway.

"misappropriation of confidential information". Without seeing the structures in question it may be hard to say, but typically 'confidential info' is qualified with "not elsewhere available"-style clauses.

"... Likewise, the Court finds that there are more than one or a few ways to organize the data structures required for programs such as LoanTrack and LoanForce..."

Yeah, but usually there's only one good way to do stuff. Yes I could just have one row with 940 columns - technically, I could make my program work with that - but it's extremely suboptimal - regardless of whether I've seen anyone else's table structures or not.


Interesting (obviously these aren't all the same person):

Page 1 of 1,763 total cases for: donald j. trump

Page 1 of 2,299 total cases for: donald trump


It's 80 cases when searching: "donald j trump"~4

This is a proximity search, to ensure it's actually turning up one of the various permutations of the name (as different court protocols may refer by surname first), rather than documents that just happen to contain each of the terms somewhere.

For fairness, "hillary rodham clinton"~4 turns up 193 cases.

Relevant doc: https://www.judyrecords.com/info (down the page, under "proximity search")
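To illustrate why a proximity search filters out unrelated documents, here is a simplified Python sketch. It is not Lucene's actual slop algorithm: it only checks whether all the terms occur, in any order, within a window of `len(terms) + slop` tokens, and it ignores repeated terms. The function name and sample strings are made up.

```python
def within_proximity(text, phrase, slop):
    """True if every term of `phrase` appears inside one window of
    len(terms) + slop consecutive tokens of `text`, in any order."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    terms = phrase.lower().split()
    needed = set(terms)
    max_span = len(terms) + slop
    for start in range(len(tokens)):
        window = tokens[start:start + max_span]
        if needed.issubset(window):
            return True
    return False


# Surname-first ordering still matches.
print(within_proximity("TRUMP, DONALD J", "donald j trump", 4))  # True

# The same terms scattered far apart do not match.
scattered = "Donald sold a car to J Smith who later worked for a firm in Trump Tower"
print(within_proximity(scattered, "donald j trump", 4))  # False
```

This mirrors the behavior described above: permutations of the name are found, while documents that merely contain each term somewhere are excluded.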


I am not fond of exposing this kind of info. Don't we all have enough prying eyes?


I've mentioned other legal databases on the info page. It's public information. judyrecords is the largest free database of court cases, but there are many other free/not free ones as well.


I did not mean this one in particular. Just my opinion about the subject in general.


In my state you can get some kind of understanding of what's going on, but it's so vague and full of legalese that half the time you only know if someone got a speeding ticket, an underage charge, or a divorce.


Since session cookies are required, here is a simple script for judyrecords searching from command line. It uses links browser and tmux.

    #!/bin/sh

    # usage: 1.sh [query] -- perform search
    # usage: 1.sh -- process results page 1 to 200
    # usage: n=5 1.sh -- process results page 5 to 200
    # usage: n=201 1.sh -- quit

    # start tmux if not already running then detach

    j=https://www.judyrecords.com;
    case $# in
    1)
    tmux set set-remain-on-exit on;
    tmux neww links;
    tmux send g $j/addSearchJob?search="$@" c-m;
    sleep 1.5;
    tmux send d;
    tmux capturep -p|sed -n /./p;
    tmux send g $j/getSearchJobStatus c-m;
    sleep 1.5;
    tmux send d;
    tmux capture -p|sed -n /./p;
    ;;0)
    test $n||n=1;while true;do test $n -le 200||break;
    tmux send g $j/getSearchResults?page=$n c-m;
    sleep 2;
    tmux send Down Down '\' 
    # small monitor where results page HTML takes 4 spacebar presses to get to bottom; 
    m=0;while true;do test $m -le 3||break;
    # process results -- e.g., print record URLs;
    tmux capturep -p|sed -n "/href=..record/{s|.*record.|$j/record/|;s/\"//;p;}";
    tmux send Space;
    m=$((m+1));done;n=$((n+1));done;
    tmux killw;
    esac


Updated and improved

    #!/bin/sh

    j=https://www.judyrecords.com;
    case $# in
    1)
    tmux new -P -d links;
    tmux set set-remain-on-exit on;
    tmux send g;
    tmux send $j/addSearchJob?search="$@";
    tmux send c-m;
    sleep 1.7;
    tmux send d;
    tmux capturep -p|sed -n /./p;
    tmux send g;
    tmux send $j/getSearchJobStatus;
    tmux send c-m;
    sleep 1.7;
    tmux send d;
    tmux capture -p|sed -n /./p;
    ;;0)
    test $n||n=1;while true;do test $n -le 200||break;
    tmux send Down 
    tmux send Down 
    tmux send g 
    tmux send $j/getSearchResults?page=$n 
    tmux send c-m;
    sleep 2;
    tmux send Down 
    tmux send Down 
    tmux send Escape;
    tmux send F 
    tmux send v 
    tmux send c-u 
    tmux send 1.htm 
    tmux send c-m 
    tmux send o 
    sed -n "/href=\"\/record/{s,.*record\/,$j/record/,;s,\",,;p;}" 1.htm;
    __grepq=$(exec sed -n '/a class=\"goToNextPage/!d;=;q' 1.htm);
    test ${#__grepq} -gt 0||break;
    n=$((n+1));
    done;
    tmux killw ;
    esac


Interesting. Haven't seen links being used in a while. Thanks for posting.


What a clean interface. We need more websites to look like this.


Thank you


Weapons of math destruction. This would be one of them. The data here is embarrassing for individuals and it can be looked up for decades in history.

I know this was always public but this makes it too easy for masses to dig through the troves.

Scares me. Next thing I see is some AR glasses that do facial recognition and correlate name -> public records. Could be a nasty blackmail tactic. Some things are close to Black Mirror in reality.


So the data should be public (as it always has been), but it should not be easily accessible?


No. Data should be erased from records after a couple of years for individuals with petty crimes.


I wasn't trying to be an asshole, just honestly searched for "javascript". Was disappointed :)


Looking at the results, those all appear to be from CourtListener's bulk data.


There are only 532 cases, so it's not too bad.


The search is very quick. Does anybody know what their tech stack looks like?



From Reddit thread:

> MySQL 8 is used for DB. The search server uses elasticsearch 7.8.


Sounds like that would be an easy use case for elasticsearch indeed. I've seen it handle much bigger data sets. Solr would work as well. There are probably a few other options on the market but elasticsearch would probably do pretty well on this even without a lot of tuning.

For reference, I once threw the entirety of OpenStreetMap at it before it even hit 1.0 to implement a simple reverse geocoding thing. Basically a couple hundred million street segments, some polygons, etc. At the time the geospatial support wasn't great and was very new and very CPU intensive. I got away with indexing all of that and running it on a single node cluster with a Xeon and 32G of RAM and spinning disk (RAID 1, no SSD). It worked great. Very responsive. Indexing only took about 50 minutes or so. Most of that was my parsing logic. That's not comparable of course; I'd expect this to be faster on the same hardware with a current version of Elasticsearch. They've made a lot of leaps with improving performance, memory usage, CPU usage, disk usage, robustness, etc. in the 7 major versions since then.


This is courtlistener.com data correct?


From other comment: CourtListener has about 4 million opinions, which are included. On top of that, 435 million additional cases from throughout the US.


Interesting. Any stats/aggregations on numbers per resource type (e.g., 56k for scotus, 36k for D.C. Circuit Court etc)?


Great site, fast and simple.

When I type a query and press search, would like it if the URL updated with the search in the query string. It would make it easier to share specific queries.


Good point, that would definitely be an improvement. Thank you.


This is awesome. Where did the data come from?


Thanks. All the data is collected from various government databases.


How did you do that? Did you have to implement a scraper for each county?


This is where it's nice to have a common name. Honestly it's worth changing your first and last name to something generic.


I don't see a breakdown by source. What does this have that courtlistener doesn't, for example?


CourtListener has about 4 million opinions, which are included. On top of that, 435 million additional cases from throughout the US.


Where are they getting public domain opinions that CL doesn't have? Are these states or counties that CL doesn't scrape? It would be nice to have a breakdown by jurisdiction.

Also, by "case" do you mean "opinions"?

Full disclosure, I've written and contributed to several scrapers for CL, and if there's a large source they're missing I'd like to know.

Note that the CL opinion number you're quoting doesn't include orders from Federal courts that are in the RECAP collection, which accounts for several million additional opinions.


trellis.law does something similar

their searches are indexed and have rulings and documents as well.

does this differ from that service?


Congratulations on the launch. I have worked in open source and public record research for the last 15 years, and your coverage is extremely impressive.

Do you have any long term plan for the site? I can see this going in a lot of different directions depending on your goals.


Thanks, as far as I know it's the largest database of court cases on the Internet. If there's enough traffic I'll support the site with ads. Don't have any other specific plans currently.


I run a similar free site and was looking at adding ads. Google Adsense rejected it for not complying with their program policies. My data is on large US federal bankruptcies, so I really couldn’t pinpoint why, but just a heads up that it might be more difficult.


Fascinating! Was surprised to see random infractions

Does this have the lower trial court records too?


Yes, it has records from different trial courts.


Is there a list of jurisdictions and courts that it has?


Glad the record I got expunged from 20+ years ago is in there. Really nice.


You might consider giving credit to the sources of data used to make this.


All the data is from government databases directly, aside from CourtListener, which was recently integrated. It would be good to specifically mention CourtListener's contribution.


How did you get all that data from government databases directly? Do they provide some sort of API for bulk export?


It is all public records. The source of the original data is the court system. If a 3rd party physically scraped it from the court system, others should be able to digitally scrape it.


It's throwing a 500 for some regex I fed it.


I recently added advanced query support. Looks like I need to clean up some validation. Thanks.


This is great. I had no idea how many times people tried to end Prop 13 using the courts or how litigious HJTA was.

Hard to read on mobile though


Search for "analytics" and then you realize that there is a collection company that sues the hell out of people.


This dataset would be yummy for GPT3


Interesting, I'll check it out. Thanks for the link.


Didn't Aaron Swartz try to do this but couldn't because it costs $0.10 per page?


From what I understand, he had some kind of academic library access for PACER and used that to bypass what others would be charged for. There are lawsuits against PACER charging fees for what's public information generated by taxpayer money. He ended up being charged with various crimes related to maybe computer fraud and eventually committed suicide. A very sad story.


Jeez, 36 pages with resume for "Napster"!


Wow a seat belt violation from 2004!


What stack did you use for this?


See other comment link to reddit.


lol so many records that should have been destroyed and not indexable!

so do I get a court order for each county, the website, the resyndicating source that the website uses or what?

I looked at the reddit page and other people noticed the same thing, the author just said send me the link! Hahaha one by one removal maybe!

Shut it down, enjoy it while it lasts


I don't think you understand what you're talking about. There are many databases that are made up of public records. Many aren't free, some are.


That may be the reality but if the court or due process ordered something expunged from a record it should be updated in all records and the details not present.

Should just do a search for expunged or similar terms and remove those entries.


> lol so many records that should have been destroyed and not indexable!

You want secret courts?


Do you want things that children do to follow them for the rest of their lives?


Well, no. Are there names of minors in this database? I thought the US had a mechanism to prevent that, or at least to petition to have records of minors removed or anonymized.


Yes. The mechanisms are shit. Many of these cases are juvenile cases with a note saying the case is sealed, along with full details of the charge, name, and outcome.

Edit: wow, plus family court stuff like a four year custody dispute, kids being adopted, etc


"The US" has 39,044 distinct local governments and municipalities and they all do their procedural nuances differently and to varying efficacy and different points in time! :D


I don't know what culture you come from, but in the US and UK and similarly influenced cultures justice being seen to be done and recorded is a pretty important principle and mechanism against overreach of the state.


I agree. But it's worth noting that the UK has recently enacted the Right to be Forgotten Law, which plays into this discussion.


Of course, once data is replicated and distributed around, it's very hard to put the genie back into the bottle.


There are significant limits to that, such as juvenile courts.


The following itself is not the issue right?


they weren't secret and were available for public perusal and judgement until the designated time

secret courts have cases that are secret from the beginning


how was this data acquired? did you scrape government websites for this data?


This is great. It's like Google before it became evil.


Thank you


hahaha - just found dirt on like 1/4 of the people I know.


[flagged]


Wow, I really wish I hadn't jumped down that rabbit hole.


Feel free to join in if you're interested in gender dynamics stuff


Super cool--and very fast! Anyone looking to collaborate on these can easily add Kontxt (https://www.kontxt.io) right on to them and have localized discussions directly on page-parts.


Thanks. I saw your post on reddit a while back. Was going to ask about your tech stack.


I used React client-side, Node server-side, and MySQL as the db. I only mentioned Kontxt here because I demoed it for Thomson Reuters because it could be helpful for their legal professionals as a collaboration tool after they find documents via their WestLaw legal search product, and your tool reminded me of it. I actually used Kontxt as a sales pitch to highlight their annual report and add some calculations and explanations about how much money they could make. Nice work, again!



