
The number of websites loading scripts from polyfill.io is decreasing, which is good:

    clickhouse-cloud :) SELECT date, count() FROM minicrawl_processed WHERE arrayExists(x -> x LIKE '%polyfill.io%', external_scripts_domains) AND date >= now() - INTERVAL 5 DAY GROUP BY date ORDER BY date

       ┌───────date─┬─count()─┐
    1. │ 2024-06-22 │    6401 │
    2. │ 2024-06-23 │    6398 │
    3. │ 2024-06-24 │    6381 │
    4. │ 2024-06-25 │    6325 │
    5. │ 2024-06-26 │    5426 │
       └────────────┴─────────┘

    5 rows in set. Elapsed: 0.204 sec. Processed 15.70 million rows, 584.74 MB (76.87 million rows/s., 2.86 GB/s.)
    Peak memory usage: 70.38 MiB.
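
To put a number on the drop, here is a minimal sketch that computes the day-over-day change from the counts in the table above. The `counts` dict and the `day_over_day` helper are illustrative names, not part of the dataset or of ClickHouse, and the sketch assumes each day's count is complete, which the output does not state.

```python
# Daily counts copied by hand from the query output above.
counts = {
    "2024-06-22": 6401,
    "2024-06-23": 6398,
    "2024-06-24": 6381,
    "2024-06-25": 6325,
    "2024-06-26": 5426,
}


def day_over_day(counts):
    """Return (date, absolute change, percent change) for each day after the first."""
    dates = sorted(counts)  # ISO dates sort chronologically as strings
    changes = []
    for prev, cur in zip(dates, dates[1:]):
        delta = counts[cur] - counts[prev]
        pct = 100.0 * delta / counts[prev]
        changes.append((cur, delta, round(pct, 2)))
    return changes


changes = day_over_day(counts)
for date, delta, pct in changes:
    print(f"{date}: {delta:+d} ({pct:+.2f}%)")
```

Under that assumption, the last day accounts for most of the decline: 899 fewer sites than the day before, roughly 14%.
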

PS. If you want to know more about this dataset, see https://github.com/ClickHouse/ClickHouse/issues/18842


