lucasschm's comments

lucasschm · on May 27, 2017

Bloom filters? They have false positives but no false negatives.
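To illustrate that property, here is a minimal sketch of a Bloom filter in Python. The class name, the sizes (1024 bits, 3 hashes), and the use of seeded SHA-256 as the hash family are all assumptions made for this example, not taken from any particular library.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus several hash positions per item."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Both sizes are arbitrary illustrative defaults.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # True can be a false positive (bits set by other items).
        # False is always correct: an added item has all its bits set.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for word in ["apple", "banana", "cherry"]:
    bf.add(word)
```

Every added word is reported as present, so there are no false negatives. A word that was never added is usually reported as absent, but it can collide with bits set by other words, which is where the false positives come from.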

lucasschm · on May 27, 2017

You are correct, but HyperLogLog uses many buckets, each tracking the longest run of zeros it has seen, so that a single outlier does not dominate the estimate. I recently studied these probabilistic algorithms and wrote a notebook with code and plots to show their performance: https://github.com/lucasschmidtc/Probabilistic-Algorithms/bl...
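A rough sketch of the bucket idea, for anyone who wants to see it in code. This is a simplified HyperLogLog written for this comment, not the code from the notebook: the function name, the choice of SHA-256 truncated to 64 bits, and the default of 2^10 buckets are assumptions.

```python
import hashlib
import math


def hll_estimate(items, p=10):
    """Estimate the number of distinct items using 2**p buckets."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash value
        bucket = h >> (64 - p)                 # first p bits pick the bucket
        rest = h & ((1 << (64 - p)) - 1)       # remaining 64 - p bits
        # Rank = number of leading zeros in the remaining bits, plus one.
        rank = (64 - p) - rest.bit_length() + 1
        registers[bucket] = max(registers[bucket], rank)

    # Harmonic mean across buckets, so one unlucky bucket cannot dominate.
    alpha = 0.7213 / (1 + 1.079 / m)  # bias constant, valid for m >= 128
    estimate = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting.
    zeros = registers.count(0)
    if estimate <= 2.5 * m and zeros:
        estimate = m * math.log(m / zeros)
    return estimate
```

With 1024 buckets the expected relative error is roughly 1.04 / sqrt(1024), about 3%. Duplicates do not change the result, since a repeated item hashes to the same bucket with the same rank.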

snowcrshd · on May 27, 2017

Thanks for sharing that!

Just skimmed through it and it seems pretty interesting. I'll read it in more depth later.

lucasschm · on May 27, 2017

No problem. If there are mistakes or a segment is not clear, let me know.

meetapoorvgupta · on May 28, 2017

Thanks for the write up, Lucas. It was very intuitive and I learnt a lot.

I noticed that you used 5000 buckets to store the frequency of 7000 non-unique words in the section on 'Counting Bloom Filters'. How is that better than using 7000 buckets and a uniformly distributed hash function, which would maintain frequencies perfectly? We would be using fewer buckets by an order of magnitude in a real-world implementation to save memory.
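For reference, the trade-off can be sketched with a toy counting Bloom filter that has far fewer counters than an exact table would need. This is a hypothetical example, not the notebook's code: the class name, the 64 counters, and the seeded SHA-256 hashes are assumptions.

```python
import hashlib
from collections import Counter


class CountingBloomFilter:
    """Toy counting Bloom filter: counters instead of bits."""

    def __init__(self, num_counters=64, num_hashes=3):
        # Deliberately small, illustrative sizes.
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Distinct counter positions for this item (a set, so a counter
        # is bumped at most once per add).
        positions = set()
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            positions.add(int.from_bytes(digest[:8], "big") % self.num_counters)
        return positions

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def count(self, item):
        # The minimum counter is an upper bound on the true frequency:
        # collisions can only inflate counters, never shrink them.
        return min(self.counters[pos] for pos in self._positions(item))


words = ["the", "cat", "the", "dog", "the", "cat", "bird"]
cbf = CountingBloomFilter()
for w in words:
    cbf.add(w)
exact = Counter(words)
```

Here `cbf.count(w)` is never below `exact[w]`, and it is only above it when every one of the word's counters is shared with other words. The memory saving shows up when the number of counters is well below the number of distinct words, at the cost of occasional overestimates.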

lucasschm · on May 28, 2017

Yeah, I should have given more thought to that number. Updated the example for N=300. Thanks

anhldbk · on May 27, 2017

Thanks for sharing, guy! Interesting repo.

lucasschm · on May 25, 2017

For this subject I would like to offer this counter view: Why China bears are wrong: An interview with Andy Rothman (http://supchina.com/sinica/china-bears-wrong-interview-andy-...)

Since someone is likely to mention the ghost cities, I recommend this video: https://www.youtube.com/watch?v=AyBBQ-wF87M&list=PLxh5xkC0W-...

seanmcdirmid · on May 25, 2017

Ghost cities are weird: first they talk about the ghost cities, then others say the ghost cities are filling up. If you actually visit, say, Ordos New Town, you'll really get that, no, those ghost cities really exist.

Some will fill up, like Pudong did; I bet Tianjin's new financial district will also. But those in areas with little economic hope in the near term (Ordos and dying coal) really aren't going to fill up before the buildings become substantially rundown (given Chinese concrete overbuilding to make use of unskilled migrant labor, these buildings require a lot of maintenance and will look decrepit sooner rather than later).

whazor · on May 25, 2017

I wonder what percentage of all buildings in China are empty. During my visit I saw plenty of empty skyscrapers, especially next to decaying homes where people still lived.

seanmcdirmid · on May 25, 2017

Local governments have ways of telling, e.g. by electricity usage. You can also try counting the lights on at night to get an idea of apartment occupancy (a fun pastime at the apartment complex I used to live in). Someone definitely knows, but you can be damned sure that this information is considered "state secrets."

bottled_poe · on May 25, 2017

Given that this guy benefits financially from investor sentiment, I would take his advice with a very large grain of salt.

aqsheehy · on May 25, 2017

It's a catch-22: people who know the most about a market are almost certainly invested.

lucasschm · on April 30, 2017

Recently I found out that my knowledge of probabilistic algorithms was quite lacking, so I decided to write a Jupyter notebook about them. I think some of you will find them as interesting as I did. And if there are any mistakes, let me know.