RubyFlow : The Ruby Community Blog


igrigorik — 116 posts

Some people, when confronted with a problem, think: "I know, I'll use UA/device detection!" Now they have two problems... Service Worker solves this problem: instead of guessing on the server, we can (finally) teach the client to report the necessary values!
Current web platform primitives are not sufficient to deliver an extensible and perf-friendly platform - we need to fix that.
How many font variants is your site using? Do you need all of them, and could you rely on the browser to generate some on your behalf? A look under the hood of how font selection and synthesis works in the browser.
Script-injected scripts block on CSS and delay script execution. Their era has passed, and we now have a much better, cleaner, and faster solution: add an async attribute to your script tags.
WiFi makes no latency promises; 4G incurs scheduling costs but offers more stable performance. To minimize latency, don't trickle data!
Actually, it's likely not any slower for mobile clients: don't confuse relative performance with absolute (latency) savings. Confused? Read on...
A hands-on look at how to measure web font latencies and optimize their use: transfer latencies, time of initial fetch, and interaction with the critical rendering path. Plus, an under-the-hood look at some of the upcoming optimizations in Chrome: font timeouts, faster fetches, and support for Font Load Event API.
TLS is not slow, it's unoptimized. A hands-on tour of optimizing nginx to deliver one-RTT Time To First Byte (TTFB) with TLS.
Crash course on optimizing WebSocket compression: a look under the hood of Deflate compression and how to configure it for best performance.
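As a rough illustration of the Deflate step this post refers to, here is a minimal sketch using Ruby's standard zlib bindings. The sample payload is made up, and the snippet only demonstrates raw Deflate at a chosen compression level; it does not implement the WebSocket compression extension's framing.

```ruby
require "zlib"

# Hypothetical payload: repetitive text, which Deflate handles well.
message = '{"type":"tick","symbol":"RUBY","price":42}' * 20

# Negative window bits request a raw Deflate stream (no zlib header or checksum).
deflater = Zlib::Deflate.new(Zlib::BEST_COMPRESSION, -Zlib::MAX_WBITS)
compressed = deflater.deflate(message, Zlib::FINISH)
deflater.close

# The receiver must inflate with matching window bits.
inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
restored = inflater.inflate(compressed)
inflater.close

puts "original: #{message.bytesize} bytes, compressed: #{compressed.bytesize} bytes"
```

Swapping `Zlib::BEST_COMPRESSION` for `Zlib::BEST_SPEED` is one way to see the CPU versus size trade-off on your own payloads.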
TLS record size can have significant impact on the page load time performance of your application: keep record size small!
Pure ruby framework and transport agnostic implementation of HTTP 2.0 protocol: new http-2 gem.

Client-Hints automates DPR switching without requiring any modifications of our existing HTML and CSS markup. How? Simple and battle-tested HTTP negotiation.
HTTP 2.0 enables the server to send multiple responses for a single client request, which opens up an entire world of new optimization opportunities!
A hands-on look at how to configure Nginx to transparently deliver and cache WebP assets via Accept negotiation.
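The post does this negotiation in Nginx configuration; the same decision can be sketched in plain Ruby to show the logic. The helper names `accepts_webp?` and `pick_variant` and the `.webp` suffix convention are assumptions for illustration, not taken from the post.

```ruby
# Returns true when the Accept header advertises image/webp with a non-zero q-value.
def accepts_webp?(accept_header)
  accept_header.to_s.split(",").any? do |entry|
    media_type, *params = entry.strip.split(";").map(&:strip)
    next false unless media_type.to_s.casecmp?("image/webp")

    quality = params.find { |param| param.start_with?("q=") }
    quality.nil? || quality.delete_prefix("q=").to_f > 0
  end
end

# Hypothetical convention: the WebP variant lives next to the original with a .webp suffix.
def pick_variant(path, accept_header)
  accepts_webp?(accept_header) ? "#{path}.webp" : path
end

puts pick_variant("/img/logo.png", "image/webp,image/*,*/*;q=0.8")
puts pick_variant("/img/logo.png", "image/png,*/*;q=0.8")
```

Any response chosen this way should also carry a `Vary: Accept` header so that caches keep the two variants apart.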
An average page is now over 1300 kB in size and over 60% of that is in images... WebP provides 30-80% improvement over JPEG and PNG - latest news and updates from WebP team, and example Varnish/Nginx configs for WebP detection.
Chrome gets faster as you use it. Chrome learns the topology of the web, browsing patterns, and critical resources on every page to optimize your browsing! A look under the hood of how it all comes together...
3 hour workshop on web performance from the ground up: what is fast, impact of latency and bandwidth, TCP performance, SPDY protocol, browser parsing and execution, rendering optimizations, critical path, and more.
If we really want to make an impact on web performance, then image formats are the place to do it. There is absolutely no reason why we shouldn't have dozens of specialized formats, each tailored for a specific case and type of image. But before we get there, we need to iron out some kinks...
If you are using Google Analytics, then you have a powerful anomaly detection engine at your disposal... and it can be easily configured to help you monitor the performance of your site: server response times, DNS, page loading times, and more.
Your browser is one of the best-instrumented development platforms - you may just not realize it yet. Check out these videos to learn how to debug network, rendering, and JavaScript performance in Chrome, and also learn how to extend devtools with extensions, debugging protocol, and more!
With SSL + NPN support in HAProxy, adding SPDY support to your site has never been easier. A hands on look at the configuration to make it all work. You can now deploy a simple Ruby SPDY server without having to worry about SSL, or NPN!
mod_pagespeed is a just in time (JIT) performance compiler for the web. This free and open-source Apache module automates all of the most popular web-performance best practices by dynamically rewriting and optimizing your website assets. A look under the hood and the architecture of the module within Apache...
An in-depth look at the performance and optimizations behind web fonts, and Google Web Fonts in particular. Web fonts are here to stay, and that's a good thing - yes, even for performance!
A hands-on look at the HTTP Archive data format, which allows us to export, analyze, and visualize network performance data from the network timeline... Learn how to build a performance dashboard in three easy steps, with free and open-source tools!
'High speed' connectivity is not all about bandwidth: latency is the new bottleneck for most web browsing applications, especially for the mobile web. An overview of what latency is, how it affects us, and the current latency numbers for wired and the mobile web.
Chrome supports SSL-based proxies, which allows us to setup and use fast and secure SPDY proxies! A DIY guide for setting up your own SPDY proxy, ala Amazon Silk.
Modern browsers are much smarter than most give them credit for. Myth #1: All stylesheets block rendering; Myth #2: CSS is always in the critical path. A look under the hood at how WebKit handles CSS loading.
Network latency is anything but free. To address this, Chrome learns the network topology as you use it via a number of predictor heuristics. Let's take a peek under the hood of the Chrome networking stack and see how this can be applied to building faster web apps.
A look at the numbers behind RailsConf presentation on "Making the Web Faster", and the web as a (future) presentation delivery platform.
RailsConf 2012 slides for Making the Web Fast(er) one Rails page at a time - tools and tips for optimizing your pages using a variety of Google tools (aka, the tools we use at Google to optimize our products).
Chrome's remote debugging allows you to easily drive the browser via a WebSocket: interact and modify the DOM, listen to network events, instrument the V8 VM, and much more! A hands on example of driving Chrome via Ruby + WebSockets.
If your job is to think about web performance, then you need to approach it from a user's perspective: use Navigation Timing to measure true latency, leverage Site Speed reports in Google Analytics, and focus on the shape (distribution) of the performance data! A hands-on look at the spec, and sample GA reports.
JDK7's Fork/Join combines a double ended queue (deque) and a recursive job partitioning step to minimize synchronization - a great design pattern to keep in mind for single host and distributed cases! A look under the hood of the Fork/Join framework, and a few JRuby examples which will light up all of your available cores.
LevelDB combines the SSTable, the MemTable, and a number of processing conventions to create a fast, open-source database engine. LevelDB is now embedded in WebKit (IndexedDB), but you can also easily embed it in your own Ruby application!
WebSockets, SPDY, SSL, and persistent connections are in, except that our infrastructure can't support most of these use cases. To enable the modern, real-time web, we need to drag our 'back office' architectures into this century.
Google officially supports JavaScript, GWT, Closure, NaCl and Dart: how do these projects play together, and why the competition? What's the one true "Google" way to build a modern web application in 2012?
Don't "push" your pull requests - good work gets "pulled". Avoid the trap of working in the dark and being surprised when your contributions don't get merged.
Is it possible to securely route SSL sessions via an HTTP Proxy? A look at the existing HTTP specification, and what the SPDY spec and Google Chrome team bring to the table - hint, the answer is yes. As of January 2012, over 50% of all internet sessions will be SPDY capable, and Web-VPN is one of the many great features it brings.
For most web-browsing use cases, an internet connection over several Mbps offers but a tiny improvement in performance - don't waste your money on that high-bandwidth connection! Oh, and learn a few tips on how to build a faster web and web-services.
Do you reuse HTTP connections in your code? Does your app server support pipelining? You can speed up your code and your apps by orders of magnitude if you answer those questions correctly. A look at the HTTP internals, and Ruby libraries and solutions.
Server-Sent Events are an HTML5 feature which allows you to easily push real-time notifications from the server to the client! Why not use a websocket? Turns out SSE offers a much simpler API optimized for one way push. A quick look at an SSE API and a Ruby/Goliath server implementation.
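Part of what makes SSE simple is its wire format: plain text lines, with a blank line ending each event. A minimal sketch of a formatter follows; the helper name `sse_event` is hypothetical, and the Goliath server wiring from the post is not reproduced here.

```ruby
# Builds one Server-Sent Events message for a text/event-stream response.
# Multi-line data becomes several "data:" lines; a blank line terminates the event.
def sse_event(data, event: nil, id: nil)
  lines = []
  lines << "id: #{id}" if id
  lines << "event: #{event}" if event
  data.to_s.split("\n", -1).each { |line| lines << "data: #{line}" }
  lines.join("\n") + "\n\n"
end

print sse_event("hello")
print sse_event("line one\nline two", event: "update", id: 7)
```

A server would write these strings to a long-lived response with the `text/event-stream` content type, and the browser's `EventSource` would dispatch them as events.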
Rails 3.1.0 is on the horizon and Asset Pipeline is the king of the show. A hands on look at some of the internals + example of extending the Asset Pipeline to support Google's Closure library.
Which is the best serialization format? Protocol Buffers from Google, Facebook's Thrift, MessagePack, or maybe Avro? A look at the use cases and the historical context in which each was developed is instrumental in helping us answer this question.
Instant page loads without any network latency delays? Why, yes you can with the new browser pre-rendering support and page visibility API's. No Javascript required.
Heroku's new Cedar stack allows us to deploy any ruby app to their cloud! Which means, we can now deploy async Goliath Ruby apps with near minimal effort: streaming API's, async API endpoints, etc, all without any of the callback mess!
Hadoop batch-processing is not the panacea for every problem. StreamSQL allows us to easily filter, aggregate, and even merge multiple realtime streams to detect correlations, run custom calculations, and much more - all without extra code! A quick intro to StreamSQL, the Esper engine, and a JRuby example to apply it to a real-time Twitter stream.
The state of the art for reverse proxies (nginx, haproxy, etc.) is terrible: need to add a new appserver? Modify the config, reload - ugh. An experiment with Goliath, SPDY and 0MQ to build a zero-config reverse proxy.
Machine learning is hard, right? It doesn't have to be. Focus on developing intuitive insights, get more data, and get started! A video and Ruby code examples from a talk at GoGaRuco 2010.
VMware announced their CloudFoundry project earlier this week, which is an open source PaaS platform - run your own "mini Heroku" or "EY cloud" on your own servers! The entire platform is powered by a collection of distributed Ruby daemons and services - great case study of building a distributed Ruby system.
If you are using Google's Chrome browser, and you are using Google web services today, chances are, you are not running over HTTP! More likely, you're running over SPDY. A look under the hood of SPDY and a Ruby parser for SPDY.
Mneme is an HTTP web-service for recording and identifying previously seen records - aka, duplicate detection. It is powered by Goliath, Redis, and a collection of bloomfilters under the hood. Check out the source on github.
Goliath is an open source version of the non-blocking (asynchronous) Ruby web server framework powering PostRank. It is a lightweight framework designed to meet the following goals: bare metal performance, Rack API and middleware support, simple configuration, fully asynchronous processing, and readable and maintainable code (read: no callbacks).

Quick introduction to the features, API, and why it exists!
Ruby 1.9 packs in a lot of new and improved features at all levels: an overview of 30 new features, tips & tricks to help you take full advantage of the new runtime.
HandlerSocket is a plugin for MySQL which adds the "NoSQL" directly into MySQL. End result? Direct access to your index, faster than memcached, and full power of SQL!

A hands on look at HandlerSocket and available ruby drivers.
Ruby was influenced by languages such as Perl, Lisp and Smalltalk, and now it is influencing an entirely new set of languages such as Mirah, Reia and Rite.
Should we remove threads in Ruby? What are some "advanced concurrency models"? A look at Actors vs CSP/pi-calculus and Agent gem which models Go-concurrency in Ruby.
ZeroMQ allows the programmer to assemble high-performance, in process fanouts, queues, and other messaging patterns required for building high-performance applications. ZDevice is a new Ruby library & DSL which simplifies this process.
If you ever needed to add text-indexing or search to your application, then Lucene & Solr should be on top of your radar. A detailed look at the ecosystem and current users of both projects.
ZeroMQ is a network library which provides a much needed layer of abstraction on top of the traditional BSD socket API: transport agnostic, connection management, routing. With language bindings for a dozen languages (and Ruby, of course), it is a fast, modern API which makes developing high-performance network apps fun again.
It's not a question of whether threads, events or message-passing is a better model - the hardware trends require that we use all of the above. Either the VM (like Ruby) has to abstract it all for us, or we need to build frameworks to match the capabilities of the hardware.
Rails 3 release is on the horizon, and Railtie is what ties all the new modules together. A look at how it works, and how it enables us to build super simple and easy plugins for Rails 3!
Google's Speed Tracer instruments the browser and the V8 VM to show you what the browser is doing: GC, reflow, etc, such that you can optimize the performance of your code. A recent feature is the ability to also bring in server-side traces! Rack-speedtracer is a middleware which allows any Rack compatible app to surface its runtime information in Speed Tracer.
CAP theorem says that we can't have consistency, availability, and partition tolerance all at once: pick two! How does that affect all of the NoSQL databases and our architectures? Well, it is far more nuanced than just pick any two.
The state of the art in end-to-end Rails stack performance is not good enough. We need to fix that, and to do so, we need to revisit our app server model, as well as everything that touches it.
Beanstalk is a fast, in-memory work queue system - a memcached of work queues. A look at the features, advanced recipes, and its use at PostRank, with Ruby examples, of course.
Zookeeper is a distributed lock and metadata store originally incubated within the Hadoop umbrella of services. However, it is also generally useful for solving distributed problems, and is designed to be highly available and scalable -- a quick look at the architecture, API's, and working with it in Ruby.
The mysql gem is one of the worst offenders when it comes to performance of Rails. A look under the covers of the driver architecture & available alternatives, followed by a demo of an async ActiveRecord driver! (MySQLConf presentation)
Event driven programming does not have to be complicated. With a little help from Ruby 1.9 Fibers, and new library em-synchrony, much of the complexity is easily abstracted, which means that we can have all the benefits of event-driven IO, without any of the overhead of a thread scheduler or complicated code.
There is no reason why we can't have a schema-free MySQL engine to compete with the NoSQL solutions. A look at what "schema-free" and "document-oriented" actually means, and the ruby code to make it work.
Apache Avro is an RPC + data serialization library used by the Hadoop project (and a competitor to Thrift, Protocol Buffers, etc). A look at the Ruby interface and the features of Avro.
An in-depth look at the architecture of Ganglia (performance monitoring) for your cloud service / app, and how to connect it to your Ruby application - a new gmetric gem.
MagLev Ruby VM has a unique persistence model: it is a database and a Ruby runtime in one. Based on the Smalltalk VM, it offers a JIT, transparent object persistence, distributed layer, and many other goodies. An overview of the state, architecture and examples of use.
Working with large streams of data is becoming increasingly widespread, be it for log, user behavior, or raw firehose analysis of user generated content. Time-based bloom filters are just the tool for the job & the bloomfilter gem now implements a Redis-backed, counting, time-based bloom filter!
WebSockets are one of the most underappreciated innovations in HTML5 - bi-directional, fully asynchronous and data agnostic data exchange. Dev builds of Google Chrome, Firefox and Safari now all support WebSockets, which means that as developers we can start taking advantage of the new architectures. A hands on look at the API, and implementation of Ruby WebSocket server and clients.
Rumors of the demise of relational database systems are greatly exaggerated. While the new NoSQL storage engines are exciting to see, it is also important to recognize that relational databases still have a bright future ahead - RDBMS systems are headed into main memory, which changes the playing field altogether.
There are 8 alternative Ruby VM's and 4 of them will hit 1.0 status in the upcoming year. A detailed look at the past year and where the community is heading (hint: it's an exciting time to be a Rubyist).
A how-to for using blather and switchboard gems in Ruby to work with XMPP PubSub streams.
Nginx can be converted into a fully capable long-polling Comet server with a single plugin. Best of all, it is fully asynchronous, supports message queuing, memory limits, and is easy to get started with. A look at the configuration and a sample ruby client.
AMQP & RabbitMQ are industrial grade message routers with support for failover, load balancing, pubsub, you name it! A look at the available brokers, features, and integration with Ruby.
Google App Engine has all the potential to become a popular deployment platform for Ruby applications: Sinatra, Rack, Rails, etc! A look at the tools and gotchas of migrating to GAE.
Ensemble methods are proving to be very effective for doing collaborative filtering (product recommendations, etc) as the results from both the Netflix and the recent GitHub challenge clearly show. A look at what ensembles are, and how they can be used (in Ruby, of course).
Accessing post-Javascript DOM (server-side) with Aptana Jaxer. Want to build a spider which sees the page as it is seen by the user? This is it.
Most web applications are built with the assumption that the client / browser is 'dumb', which places all the 'scalability' and load on the server. But, what if, the browser was smarter? A look at ReverseHTTP & WebSockets APIs.
Squid cache server can help you mask slow application servers and intermittent downtime via stale-while-revalidate and stale-if-error extensions. A hands on look at configuring Squid and connecting it with a simple Rack application.
Tokyo Cabinet/Tyrant database can be scripted via Lua extensions: inject new functionality, application or data logic, etc. A look at the available extensions (auto expiry, working with sets, wordcount map-reduce example) and how to get started (with Ruby).
A look at the emerging Webhooks & PubSubHubbub HTTP callback implementations and how they enable easy, distributed, web-scale PubSub, with Ruby code samples and an example of the new PubSubHubbub Ruby gem.
Designing your application to work around IO bottlenecks is tricky business. A look at measuring & optimizing for I/O performance with iostat + a refresher on disk latencies and the underlying bottlenecks.
Using Google Perftools to profile and visualize the codepaths of any Ruby application.
Hadoop Streaming allows you to use Ruby to easily create and run Map-Reduce jobs on top of Hadoop - learn how!
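Hadoop Streaming passes records to a mapper and a reducer as lines of text, usually tab-separated key/value pairs, with the reducer receiving its input sorted by key. A word-count sketch of that contract, runnable locally without Hadoop, might look like this; the method names are hypothetical and the sort stands in for Hadoop's shuffle phase.

```ruby
# Mapper: emit "word<TAB>1" for every word on a line of input.
def map_line(line)
  line.downcase.scan(/[a-z0-9']+/).map { |word| "#{word}\t1" }
end

# Reducer: sum the counts per word from mapper output lines.
def reduce_lines(sorted_lines)
  counts = Hash.new(0)
  sorted_lines.each do |line|
    word, count = line.chomp.split("\t")
    counts[word] += count.to_i
  end
  counts
end

# Local dry run: map, sort (Hadoop does this between the phases), reduce.
input = ["Ruby loves Hadoop", "Hadoop loves ruby"]
mapped = input.flat_map { |line| map_line(line) }
totals = reduce_lines(mapped.sort)
totals.each { |word, count| puts "#{word}\t#{count}" }
```

In a real job the mapper and reducer would be separate scripts that read STDIN and write STDOUT.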
Continuations (known as Fibers) in Ruby 1.9 give us the ability to do cooperative scheduling, and implement synchronous API's on top of async libraries, which are both much more efficient and avoid the complexity of threading. A hands on look at the theory, code, and applications.
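A minimal sketch of cooperative scheduling with Fibers: each worker runs until it explicitly yields, and a small round-robin loop decides who runs next. The worker names and the `:pending` / `:finished` markers are made up for illustration.

```ruby
log = []

# Two workers that each do two steps, yielding control after every step.
workers = %w[a b].map do |name|
  Fiber.new do
    2.times do |step|
      log << "#{name}#{step}"
      Fiber.yield :pending   # hand control back to the scheduler
    end
    :finished                # block's return value is the result of the last resume
  end
end

# Round-robin scheduler: resume each fiber in turn until it reports completion.
queue = workers.dup
until queue.empty?
  fiber = queue.shift
  queue << fiber unless fiber.resume == :finished
end

puts log.join(" ")
```

Because a fiber only gives up control at `Fiber.yield`, the interleaving is deterministic, which is what makes it possible to wrap async libraries in synchronous-looking APIs.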
Building a high-performance Ruby proxy server is both remarkably simple, and a very powerful technique: intercepting arbitrary data, extending protocols & injecting new functionality, ... It's a wonderful hammer, learn how to use it!
Henry Ford's answer for 'scalable infrastructure', also commonly known as Event Driven Architecture.
A hands-on look at sorting performance, and Trie, Priority Queue and Heap data structures implementations and use cases in Ruby - drop in replacements for performance and better memory usage!
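For reference, a binary min-heap, the usual backing structure for a priority queue, fits in a few dozen lines of Ruby. This is a generic textbook sketch, not the implementation discussed in the post; the class name `MinHeap` is an assumption.

```ruby
# Array-backed binary min-heap: push and pop are O(log n).
class MinHeap
  def initialize
    @items = []
  end

  def size
    @items.size
  end

  def push(value)
    @items << value
    sift_up(@items.size - 1)
    self
  end

  # Removes and returns the smallest element, or nil when empty.
  def pop
    return nil if @items.empty?

    top = @items.first
    last = @items.pop
    unless @items.empty?
      @items[0] = last
      sift_down(0)
    end
    top
  end

  private

  def sift_up(index)
    while index > 0
      parent = (index - 1) / 2
      break if @items[parent] <= @items[index]

      @items[parent], @items[index] = @items[index], @items[parent]
      index = parent
    end
  end

  def sift_down(index)
    loop do
      left = 2 * index + 1
      right = left + 1
      smallest = index
      smallest = left if left < @items.size && @items[left] < @items[smallest]
      smallest = right if right < @items.size && @items[right] < @items[smallest]
      break if smallest == index

      @items[smallest], @items[index] = @items[index], @items[smallest]
      index = smallest
    end
  end
end

heap = MinHeap.new
[5, 1, 4, 2, 3].each { |n| heap.push(n) }
drained = []
drained << heap.pop while heap.size > 0
puts drained.inspect
```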
How-to for transparently connecting Ruby and Erlang VM's with erlectricity gem. Get the best of Ruby, and the functional programming of Erlang without sacrificing either.
Instead of using proprietary protocols, what if the barrier to entry for assembling a compute cluster was clicking a link? We can use the browser (javascript) to perform the work, and HTTP to coordinate the workflow. All we need is javascript and a 30 line sinatra web-server.
Scrooge is an automated application layer query optimizer, which tracks usage of your attributes and then dynamically rewrites the ActiveRecord queries to only select those columns.
A hands on look at the Tokyo Cabinet database project by Mikio Hirabayashi (powering - it is an incredibly fast, and feature rich database library, with Ruby libraries, and memcached and RESTful protocol support.
A look at the internals of the Ruby 1.9 Hash: how it works, how it compares to 1.8, and alternatives.
Turns out '08 was a banner year for Rails, and here's video proof. Also, visualizations and code to produce visualizations for any GitHub contributor.
A hands-on example of using the new FFI gem developed by Wayne Meissner to interface with native (C) extensions in a uniform way for Ruby MRI, JRuby and Rubinius.
When you're working with large datasets it's always nice to have a few algorithmic tricks up your sleeve, and Bloom Filters are exactly that - often overlooked, but an extremely powerful tool when used in the right context. Learn the theory and get started with Bloom Filters in Ruby.
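The core idea fits in a short sketch: hash each item to a few bit positions, set them on insert, and report membership only when all of them are set. This is a simplified illustration, not the gem from the post; the class name, the default sizes, and the use of seeded MD5 digests as hash functions are all assumptions.

```ruby
require "digest"

# Minimal Bloom filter: no false negatives, tunable false-positive rate.
class BloomFilter
  def initialize(size: 1024, hashes: 3)
    @size = size
    @hashes = hashes
    @bits = Array.new(size, false)
  end

  def add(item)
    positions(item).each { |pos| @bits[pos] = true }
    self
  end

  # true means "possibly present"; false means "definitely absent".
  def include?(item)
    positions(item).all? { |pos| @bits[pos] }
  end

  private

  # Derive several bit positions by salting the digest with a seed.
  def positions(item)
    (0...@hashes).map do |seed|
      Digest::MD5.hexdigest("#{seed}:#{item}").to_i(16) % @size
    end
  end
end

filter = BloomFilter.new
filter.add("ruby")
puts filter.include?("ruby")
```

A production filter would use a packed bit array and cheaper hash functions; the boolean array here keeps the logic easy to read.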
An AST to a program is what the DOM is to a web-page - a hands on look at working with the Ruby's Abstract Syntax Tree and the projects that are using it. Ever wanted to translate Ruby to Lolcode?
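To get a feel for what a Ruby syntax tree looks like, current Ruby versions ship a parser in the standard library (Ripper). The post itself may have used different tools; this sketch only shows the general shape of the data.

```ruby
require "ripper"
require "pp"

# Parse a snippet of Ruby source into nested arrays (an S-expression).
sexp = Ripper.sexp("1 + 2")
pp sexp

# Walk the tree and collect every node type (the leading symbol of each array).
def node_types(node, found = [])
  return found unless node.is_a?(Array)

  found << node.first if node.first.is_a?(Symbol)
  node.each { |child| node_types(child, found) }
  found
end

puts node_types(sexp).uniq.inspect
```

Once source code is data like this, translating or rewriting it becomes a tree traversal.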
Forget the maintenance page, HAProxy allows you to perform zero-downtime releases on the fly - a how-to for configuring the release procedure.
Instead of thinking in threads, you should think about process parallelism, due to the Global Interpreter Lock - a look at what that means and why.
Scaling ActiveRecord with MySQLPlus and ConnectionPool: A hands on look at the new ConnectionPool interface in Rails 2.2RC1, and the MySQLPlus gem which promises asynchronous processing.
A how-to guide for connecting syslog-ng and Splunk to get your logs, from all over the network into a central Splunk server (Ruby, Nginx, HAproxy, etc!)
Stale caches causing inconsistent user experience and response times? Yahoo developers proposed an extension (stale-while-revalidate) to address this problem. We implement a proof of concept Ruby caching server for this pattern.
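The pattern can be sketched as a small in-memory cache: fresh entries are served directly, slightly stale entries are served immediately while a refresh is queued, and only missing or very old entries block on the origin. The class name, the `revalidate!` hook, and the injectable clock are assumptions for illustration; this is not the proof-of-concept server from the post.

```ruby
# In-memory sketch of stale-while-revalidate semantics.
class StaleWhileRevalidateCache
  Entry = Struct.new(:value, :fetched_at)

  def initialize(max_age:, stale_window:, clock: -> { Time.now.to_f })
    @max_age = max_age
    @stale_window = stale_window
    @clock = clock
    @store = {}
    @pending = []
  end

  def fetch(key, &loader)
    entry = @store[key]
    age = entry && (@clock.call - entry.fetched_at)

    if entry.nil? || age > @max_age + @stale_window
      write(key, loader.call)      # miss or too stale: wait for the origin
    elsif age > @max_age
      @pending << [key, loader]    # stale but usable: answer now, refresh later
      entry.value
    else
      entry.value                  # fresh hit
    end
  end

  # Run queued refreshes; a real server would do this in the background.
  def revalidate!
    until @pending.empty?
      key, loader = @pending.shift
      write(key, loader.call)
    end
  end

  private

  def write(key, value)
    @store[key] = Entry.new(value, @clock.call)
    value
  end
end

now = 0.0
calls = 0
loader = -> { calls += 1; "v#{calls}" }
cache = StaleWhileRevalidateCache.new(max_age: 10, stale_window: 30, clock: -> { now })

first = cache.fetch(:page, &loader)   # cold miss, hits the origin
now = 15.0
stale = cache.fetch(:page, &loader)   # past max_age: stale copy served instantly
cache.revalidate!
fresh = cache.fetch(:page, &loader)   # refreshed value
puts [first, stale, fresh].inspect
```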
A hands on look at the available tools, and techniques to do log replay to simulate a collection of users + release of a Ruby driver to automate these tests.
A how-to for taking advantage of the Cloud to do seamless migrations and infrastructure updates with the help of dynamic DNS.
Benchmarks of the latest MySQLPlus, EM/MySQL and DBSlayer libraries for asynchronous database access in Ruby - with surprising and promising results!
Can a Rails app serve 3 million dynamic pageviews a day? Absolutely, is doing it on a daily basis.
Make use of UNIX signals to easily toggle debug mode on any process.
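A minimal sketch of the idea, assuming a Unix-like system and picking `USR1` as the toggle signal (the post may use a different one): install a trap that flips a flag, then send the signal with `kill` from another terminal. The global name and the `debug_log` helper are hypothetical.

```ruby
$debug_enabled = false

# Flip the flag each time the process receives SIGUSR1.
Signal.trap("USR1") { $debug_enabled = !$debug_enabled }

# Hypothetical helper: only prints while debug mode is on.
def debug_log(message)
  warn "[debug] #{message}" if $debug_enabled
end

# Demonstration: signal ourselves; normally you would run `kill -USR1 <pid>`.
Process.kill("USR1", Process.pid)
sleep 0.1 # give the interpreter a moment to run the handler

debug_log("debug mode is now on")
```

Keeping the handler down to a flag flip is deliberate, since work done inside signal handlers is restricted.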
Splunk your distributed logs in EC2 (or anywhere else) for easy management, and debugging of your Ruby apps.
Hands on tutorial / look at Ruby EventMachine (the speed demon).
Hands on how-to for optimizing response times and coordinating traffic between multiple clusters of app. servers, with the help of HAProxy.
Notes on memcached internals from MySQL Conf '08.