
Getting Started with Ray Core

Many of Ray’s concepts can be explained with a good example, so that’s exactly what we’ll do. As before, you can follow this example by typing the code yourself (which is highly recommended), or by following the notebook for this chapter, which you can run directly in Colab.

In any case, make sure you have Ray installed:

! pip install "ray==2.2.0"

A Ray Core Intro

In Chapter 1 we showed you how to start a local cluster simply by calling import ray and then initializing it with ray.init(). After running this code you will see output of the following form. We omit most of the information in this example output, as explaining it would require you to understand more of Ray’s internals first.

... INFO services.py:1263 -- View the Ray dashboard at http://127.0.0.1:8265
{'node_ip_address': '192.168.1.41',
...
'node_id': '...'}

The output of this command indicates that your Ray cluster is functioning properly. As shown in the first line of the output, Ray includes its own dashboard that can be accessed at http://127.0.0.1:8265 (unless a different port is listed in the output). You can take some time to explore the dashboard, which will display information such as the number of CPU cores available and the total utilization of your current Ray application. If you want to see the resource utilization of your Ray cluster within a Python script, you can use the ray.cluster_resources() function. On my computer, this function returns the following output:

{'CPU': 12.0,
'memory': 14203886388.0,
'node:127.0.0.1': 1.0,
'object_store_memory': 2147483648.0}

To use the examples in this chapter, you will need to have a running Ray cluster. The purpose of this section is to give you a brief introduction to the Ray Core API, which we will refer to as the Ray API from now on. One of the great things about the Ray API for Python programmers is that it feels very familiar, built around concepts such as decorators, functions, and classes. Designing a universal programming interface for distributed computing is a challenging task, but we believe Ray succeeds by providing abstractions that are easy to understand and use. The Ray engine handles the complicated work behind the scenes, which is what allows Ray to be used alongside existing Python libraries and systems.

This chapter begins with a focus on Ray Core because we believe it has the potential to make distributed computing far more accessible. Its purpose is to give you a solid understanding of how Ray works and how to grasp its basic concepts. If you are a less experienced Python programmer, or prefer to concentrate on higher-level tasks, Ray Core may take some time to get used to. Even so, we highly recommend learning the Ray Core API, as it is a fantastic way to start working with distributed computing in Python.

Your First Ray API Example

To give you an example, take the following function, which retrieves and processes data from a database. Our dummy database is a plain Python list containing the words of the title of this book. To simulate the idea that accessing and processing data from the database is costly, we have the function pause for a certain amount of time with Python’s time.sleep.

import time

database = [
    "Learning", "Ray",
    "Flexible", "Distributed", "Python", "for", "Machine", "Learning"
]


def retrieve(item):
    time.sleep(item / 10.)
    return item, database[item]

Our database has eight items in total. If we were to retrieve all items sequentially, how long should that take? For the item with index 5 we wait for half a second (5 / 10.) and so on. In total, we can expect a runtime of around (0+1+2+3+4+5+6+7)/10. = 2.8 seconds. Let’s see if that’s what we actually get:
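The arithmetic for that estimate can be checked directly, since the item with index i sleeps for i/10 seconds:

```python
# Expected sequential runtime: the sum of all per-item sleep times
expected = sum(range(8)) / 10.
print(expected)  # 2.8
```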

def print_runtime(input_data, start_time):
    print(f'Runtime: {time.time() - start_time:.2f} seconds, data:')
    print(*input_data, sep="\n")


start = time.time()
data = [retrieve(item) for item in range(8)]
print_runtime(data, start)

Runtime: 2.82 seconds, data:
(0, 'Learning')
(1, 'Ray')
(2, 'Flexible')
(3, 'Distributed')
(4, 'Python')
(5, 'for')
(6, 'Machine')
(7, 'Learning')

The total runtime of 2.82 seconds (the exact figure will vary on your computer) closely matches our estimate. It’s important to note that this plain Python version runs the calls strictly one after another; nothing happens in parallel.

That may not have surprised you, but you likely at least suspected that Python list comprehensions are efficient, so the 2.8 seconds we measured is essentially the worst-case scenario. It can be frustrating that a program which mostly just "sleeps" during its runtime is still this slow. You can blame the Global Interpreter Lock (GIL) for that, but the GIL gets enough criticism as it is.

Ray Tasks

It’s reasonable to assume that such a task can benefit from parallelization. Perfectly distributed, the runtime should not take much longer than the longest subtask, namely 7/10. = 0.7 seconds. So, let’s see how you can extend this example to run on Ray. To do so, you start by using the @ray.remote decorator:

import ray


@ray.remote
def retrieve_task(item):
    return retrieve(item)