Thursday, December 8, 2016

Linksfest to Get Started with Apache Flink


Flink, another great data processing platform, has been a rising star this year. It is a high-performance platform for both stream and batch data processing, with fault-tolerant, scalable, distributed data stream computation at its core.

Here are several links and resources to get you started.

Company and Community
dataArtisans, the company behind Flink
Google Trends comparing Flink, Spark and Storm (Spark is still way more popular)


Books

Introduction to Apache Flink, a book from Flink core developers. I highly recommend starting your Flink journey with it. It is a free download from MapR.

Flink in Action (MEAP, available in Spring 2017). The first chapter (PDF) is free and gives a good overview.


Quick start guide


Talks and Videos


Slides
Alibaba slides on Blink, their fork of Flink. Alibaba is one of the biggest e-commerce sites in China.


Performance and Benchmark


Setup

2016 Holiday Guide for Robot Toys

The holidays are just around the corner and it's time to order gifts from "Santa". This year, I decided to give my son something different, something other than candies, chocolates, Pokémon cards, LEGO, etc.

He has already been exposed to basic programming concepts through code.org and Scratch. So, how about a programmable robot for this Christmas? Sounds good.

After doing some research and comparison, we ordered an Ozobot Evo. Out of the box, it supports a color-code based language for various actions, e.g. follow the black line and move forward, stop at red, rotate at blue, play colored lights and music, etc. You can also customize the actions with a mobile app on a phone or tablet, in an environment similar to Scratch.

When he grows up a bit more, we might introduce Marty the Robot to him. It looks and works more like the robots we know about. It also teaches kids some real mechanical dynamics.



Here are some notes I took during the research. I hope they will be useful to you.


Ozobot Bit
only supports the colored line language
both 1.0 and 2.0 are available on Amazon (around $60)






Ozobot Evo
More advanced than Bit. It supports the colored line language and a Scratch-like visual programming language, can be controlled from a mobile app, and supports social interactions with friends' robots.
available on Amazon (around $100)





Codeybot
available on Amazon ($169.99)





Cozmo
A playful companion, a robot with personality, very cute!
available on Amazon (around $300)




Marty the Robot
This is for more grown-up kids: a more makebot-like robot, fully programmable. Start with Scratch, then move on to Python. The way they dance together looks so funny ;-)
not available yet, currently crowdfunding





Honeybot
An educational robot and companion for kids, not that programmable.
The founder is from Shenzhen, China (around $230)
Chinese name: 小哈早教机器人 (Xiaoha early-education robot)





Aido family robot
The size of a toddler: a family robot, assistant, helper, with voice control, etc. Reminds me of Baymax in Big Hero 6 ;-)
available for pre-order (around $600), will ship in early 2017




Saturday, November 26, 2016

Java Concurrency Counters Benchmark




















Java concurrency utilities have kept evolving and now provide many different ways to achieve similar tasks. Recently, we had a task to implement a concurrent counter. This triggered my interest in comparing the different approaches and their performance under various read and write workloads.


The end result is a simple concurrent counter implemented in various ways.
The benchmark is implemented using JMH, the standard tool for reliable Java microbenchmarks. You can find several really nice tutorials on JMH in the References section.


In my benchmark, there are write and read operations on the counter. A write takes 10ms and a read takes 2ms. I vary the number of read and write threads using JMH groups to simulate different workload mixes.

The source code, raw benchmark data, Excel sheets and visualizations can all be found in the git repo: java-concurrency-counters-benchmark.

Here is a quick summary based on my experiment (I only ran 2 warmup rounds and 2 benchmark rounds due to limited time):
  • AtomicLong and LongAdder have similar throughput. In read-heavy workloads, AtomicLong has better read and write throughput than LongAdder. In write-heavy workloads, LongAdder has slightly better write throughput.
  • Fair lock has lower throughput than regular lock in general, but not always.
  • Consider using ReentrantLock or ReentrantReadWriteLock if you need high read throughput and the concurrency level is high.
  • StampedLock provides very good write throughput in all the read-write mixes. If write throughput is important to you, give it a try. If you also need comparatively good read throughput, try StampedLock with optimistic reads. Compared with regular StampedLock, it has really good read throughput when the concurrency level is high.
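
To make the comparison above a bit more concrete, here is a minimal sketch of three of the counter flavors: AtomicLong, LongAdder, and StampedLock with optimistic reads. It is not the code from the benchmark repo. The `Counter` interface, the class names and the `hammer` helper are hypothetical names made up for this illustration, and the helper only checks correctness under concurrent writes. It measures nothing (use JMH for that).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.StampedLock;

public class CounterSketch {

    /** Hypothetical interface for this sketch, not taken from the benchmark repo. */
    interface Counter {
        void increment();
        long get();
    }

    /** Single shared value updated with atomic compare-and-swap operations. */
    static final class AtomicCounter implements Counter {
        private final AtomicLong value = new AtomicLong();
        public void increment() { value.incrementAndGet(); }
        public long get() { return value.get(); }
    }

    /** Striped cells reduce write contention; reads have to sum the cells. */
    static final class AdderCounter implements Counter {
        private final LongAdder value = new LongAdder();
        public void increment() { value.increment(); }
        public long get() { return value.sum(); }
    }

    /** Writes take the exclusive lock; reads first try a lock-free optimistic read. */
    static final class OptimisticStampedCounter implements Counter {
        private final StampedLock lock = new StampedLock();
        private long value;

        public void increment() {
            long stamp = lock.writeLock();
            try {
                value++;
            } finally {
                lock.unlockWrite(stamp);
            }
        }

        public long get() {
            long stamp = lock.tryOptimisticRead();
            long current = value;
            if (!lock.validate(stamp)) {
                // A write happened in between: fall back to a regular read lock.
                stamp = lock.readLock();
                try {
                    current = value;
                } finally {
                    lock.unlockRead(stamp);
                }
            }
            return current;
        }
    }

    /** Runs the given number of writer threads and returns the final count. */
    static long hammer(Counter counter, int threads, int incrementsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("AtomicLong:  " + hammer(new AtomicCounter(), 4, 10_000));
        System.out.println("LongAdder:   " + hammer(new AdderCounter(), 4, 10_000));
        System.out.println("StampedLock: " + hammer(new OptimisticStampedCounter(), 4, 10_000));
    }
}
```

With 4 writer threads doing 10,000 increments each, every variant should report 40000. The optimistic read in `get()` is the part that matters for read throughput: when no write is in flight, the reader never acquires the lock at all.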

Special thanks and references:

Tuesday, May 17, 2016

Learning Data Visualization

Data visualization provides insightful tools to visually analyze the data, observe the trend, compare the data series, filter out the data noise, etc.

I spent some time learning several of the most commonly used JavaScript data visualization libraries. It is really exciting to turn monotonous numbers into beautiful charts.

Here is the git repo that has the sample charts I am playing with: https://github.com/guozheng/learn-dataviz

If you want to quickly create charts from the available chart types, I'd recommend either Highcharts or Google Charts. If you need heavy customization, or need to create new chart types, then D3.js and the libraries built on it (NVD3, C3.js, React D3) are very powerful and flexible.