Machine learning made possible with zero background


Machine learning has been at the forefront of the rising popularity of data-driven software systems. However, machine learning has a comparatively high entry barrier. A shortage of machine learning expertise could prevent many companies from benefiting from it. Facing such challenges, both academia and industry have invested heavily in automating the machine learning process. Oracle’s AutoML is one such example.

The classic machine learning process

The machine learning process usually starts with data collection and preprocessing. We then need to select an algorithm from a vast number of possible machine learning algorithms. Some examples are logistic regression, random forests, support vector machines, convolutional neural networks and the currently popular transformer. After an algorithm is selected, we need to optimize the hyper-parameters and parameters of the algorithm. …
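The two manual steps described above, picking an algorithm and then tuning its hyper-parameters, can be sketched as a small exhaustive search. This is only a toy illustration and not how Oracle's AutoML works: the dataset, the two candidate algorithms (`threshold_classifier`, `knn_classifier`) and the grids of hyper-parameter values are all made up for the example.

```python
# Toy 1-D dataset of (feature, label) pairs. All values are made up.
train = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
         (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
validation = [(0.8, 0), (1.8, 0), (2.4, 0), (3.2, 1), (4.2, 1)]


def threshold_classifier(threshold):
    """Algorithm 1: predict 1 when the feature exceeds a fixed threshold."""
    return lambda x: 1 if x > threshold else 0


def knn_classifier(k):
    """Algorithm 2: majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return 1 if 2 * votes > k else 0
    return predict


def accuracy(model, data):
    """Fraction of examples in `data` that the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


# Each entry: (algorithm name, model builder, hyper-parameter values to try).
search_space = [
    ("threshold", threshold_classifier, [1.0, 2.5, 4.0]),
    ("knn", knn_classifier, [1, 3]),
]

# Score every (algorithm, hyper-parameter) pair on the validation set.
results = []
for name, build, grid in search_space:
    for value in grid:
        results.append((accuracy(build(value), validation), name, value))

# Keep the best-scoring combination (ties go to the first one tried).
best_score, best_name, best_value = max(results, key=lambda r: r[0])
print(best_name, best_value, best_score)
```

Even in this tiny setting the search space is the product of algorithms and hyper-parameter values, which is why doing it by hand becomes expensive and why automating it is attractive.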


What is it about and why does it matter to us?

Amazon Web Services (AWS)’s annual re:Invent conference is one of the most anticipated events in the global cloud computing community. This year, it is a 3-week conference running from November 30 to December 18. Like many other events, it is being held virtually.

Though the event is still taking place, many new products and features have already been announced. In this post, we have summarized for you the biggest announcements made so far and how we may benefit from them.

Looking at the new releases this year, we can see two main areas in which AWS has invested heavily: integrating machine learning into cloud applications and simplifying the cloud development journey. Below, we have listed a number of exciting products for both areas, in no particular order. …


How Facebook designed a database with 10X write throughput and 2X compression rate compared to InnoDB


Databases are at the heart of virtually all internet applications. Database throughput and latency are often the bottleneck of internet services and have attracted attention from many researchers. As the amount of data grows exponentially, it becomes more and more important to store data efficiently. In this blog, we will introduce Facebook’s MyRocks database and examine how it improves write throughput by 10X and compression rate by 2X compared to InnoDB.

The architecture of InnoDB

InnoDB, like other classic relational databases, uses a B-Tree to organize its data. A B-Tree is similar to a binary search tree with a few…


How your social media data on Facebook are organized and stored


Facebook needs no introduction. It has more than 1 billion active users who record their relationships, share their interests, and upload and comment on text, images and videos. In this blog, I will mainly discuss two aspects of Facebook’s backend system:

  • How the social graph is modeled and stored, that is, the database schema adopted by Facebook.
  • How Facebook scaled its infrastructure, including servers, cache and databases, to serve 1 billion reads and millions of writes per second.

The Data Model for the Social Graph

Facebook stores the majority, if not all, of its users’ data, such as profiles, friends, posts and comments, inside a single giant social graph. …


Part 2: Caching


This is the second part of the blog series on how to scale a web service efficiently. The first part, which contains an introduction and patterns for scaling databases, can be found here.

Pattern 3 Cache

Caching is one of the most widely used approaches to scale a system. Compared to the database, a cache provides smaller but faster storage. It takes advantage of locality: recently used data is likely to be used again. Therefore, we can store recently accessed data in the cache and save time by reading from the cache when we need the data again in the near future. …
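As a rough illustration of the locality idea, here is a minimal sketch of a least-recently-used (LRU) cache sitting in front of a slower store. LRU is just one common eviction policy chosen for the example; the `LRUCache` class, the `load_from_database` callback and the in-memory `database` dictionary are all hypothetical stand-ins, not part of any particular system.

```python
from collections import OrderedDict


class LRUCache:
    """A small cache that keeps only the most recently used entries."""

    def __init__(self, capacity, load_from_database):
        self.capacity = capacity
        self.load_from_database = load_from_database  # hypothetical slow lookup
        self.entries = OrderedDict()  # ordered from oldest to newest use
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        # Cache miss: fall back to the slow store and remember the result.
        self.misses += 1
        value = self.load_from_database(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return value


# A plain dictionary stands in for the database in this sketch.
database = {"user:1": "Alice", "user:2": "Bob", "user:3": "Carol"}
cache = LRUCache(capacity=2, load_from_database=database.__getitem__)

for key in ["user:1", "user:1", "user:2", "user:3", "user:1"]:
    cache.get(key)

print(cache.hits, cache.misses, list(cache.entries))
```

The second read of `user:1` is served from the cache, while the last read misses because `user:1` was evicted once the cache exceeded its capacity of two entries. A cache only pays off when the working set of hot keys fits within it.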


Part 1: Scale the database layer


Whether we work as freelancers, run a startup, or are employed in the tech industry, we are likely to work with some web services. Web services normally start with a simple design and a small number of users. As our web services become popular, we need to scale them. However, scaling up a web service is extremely complicated, since the design of a web service targeting 1,000 users is dramatically different from that of a web service targeting 1,000,000 users. What’s worse, we often need to scale the web service within a very short time frame. For example, your startup could become super popular overnight, driving 10x more users to your website. Under those circumstances, we need to be very careful about the changes we make. …



DynamoDB is a NoSQL database provided by Amazon Web Services (AWS). It can provide extremely high performance, handling more than 10 trillion requests per day with peaks greater than 20 million requests per second, and can support tables of virtually any size through horizontal scaling. It is not uncommon for DynamoDB to serve petabytes of data.

In this article, we will start with a brief introduction to the APIs of DynamoDB. We will then look into the architecture of DynamoDB and explain in detail why it can provide such high performance.

From a 10,000-foot view, DynamoDB is basically a key-value store. It can be thought of as a hash map backed by some persistent storage. The two most important operations supported by DynamoDB are Get and Put. …
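The "hash map backed by persistent storage" mental model can be sketched in a few lines. This is not DynamoDB's API or implementation: the `TinyKeyValueStore` class, its `put` and `get` methods and the JSON file used for persistence are hypothetical, chosen only to make the idea concrete on a single machine.

```python
import json
import os
import tempfile


class TinyKeyValueStore:
    """A hash map whose contents are written to a JSON file on every put."""

    def __init__(self, path):
        self.path = path
        self.items = {}
        # Reload previously persisted data, if any.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.items = json.load(f)

    def put(self, key, value):
        """Store the value in memory and persist the whole map to disk."""
        self.items[key] = value
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.items, f)

    def get(self, key):
        """Return the value for the key, or None if the key is absent."""
        return self.items.get(key)


# Write through one instance, then read through a fresh one to show
# that the data survives beyond the in-memory dictionary.
path = os.path.join(tempfile.mkdtemp(), "store.json")
store = TinyKeyValueStore(path)
store.put("user#42", {"name": "Meg"})

reopened = TinyKeyValueStore(path)
print(reopened.get("user#42"))
```

A real distributed store has to solve much more than this sketch does, such as partitioning keys across machines and replicating data, but the Get and Put interface stays conceptually this simple.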


Abstract

Aurora Database is Amazon’s cloud-native database. It can hold up to 64 TB of data and is much faster than a MySQL database. Many companies have adopted Aurora Database.

In this article, we will introduce the architecture of Amazon’s Aurora Database. We will start with the architecture of a traditional relational database system, such as MySQL, and discuss its limitations. We will then discuss how Aurora Database extends the functionality of a traditional database to improve availability, reliability and scalability.

If you are interested in the architecture of DynamoDB, another featured database from AWS, you can read about it in my other blog post [Click here].


Previous: Write Your Own OS (4) — Boot Process [https://medium.com/@megtechcorner/write-your-own-os-4-boot-process-b7cb7bef2fcb]

Next: Write Your Own OS (6) — Control the cursor (Coming soon)

— — — — — — — — — — — — — — — — — — — — —

Part 1.2 Print to screen

In this section, we will extend our Operating System to print to the screen and move the cursor to the appropriate location. First, we will set up a stack for the Operating System so that we can switch to the C programming language. Next, we will print to the screen using the framebuffer. …


Previous: Write Your Own OS(3) — Bare Bone OS

Next: Write Your Own OS(5) — Print to screen

Part 1.1.3 Boot Process

In this part of the tutorial, we will introduce the boot process of the computer, from the moment we press the power button to the moment our Operating System begins to run. This process involves a chain of programs. One example is illustrated in the figure below. When the computer first powers on, it is “hardcoded” to run a small program, the BIOS. The BIOS then loads and calls the entry function of a larger program, the bootloader. The bootloader then loads our Operating System into memory and calls its entry function. (Line 19 of start.S …

About

Meg's tech corner

80% System; 20% ML
