Become an rpk pro in just 10 minutes

Ready to get savvy with Redpanda's friendly command line interface (CLI)? From managing clusters to security and access control, this video walks you through all the cool rpk features and how you can master them in no time.

Featuring: Christina Lin, Director of Developer Advocacy, Redpanda.

What's in the video

  • Introduction to rpk
  • Install rpk
  • Autocompletion
  • Local container
  • Cluster configuration
  • Partition balancing
  • Broker configuration
  • Development and Production mode
  • Tuning
  • Topic configuration
  • ACL (Secure)
  • Managing consumer groups
  • Generate templates

Rather read the transcript? Dig in

Christina Lin (0.00)
Hey there! Today we're about to explore something really exciting in Redpanda – the rpk tool. This is what you will use to interact with a Redpanda cluster. Whether you’re an administrator that looks after the cluster or a developer that connects to it, rpk is going to be your best friend.

Installing rpk varies depending on your operating system. I'm going to link the documentation down below.

Before you get started with rpk, one super helpful thing is autocompletion. It was released in the latest 23.2 version. You don’t need to install it, but if you want to enable it, you'll need to install it beforehand.
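A minimal sketch of setting up autocompletion for Bash, assuming rpk is already on your PATH. `rpk generate shell-completion` prints the completion script; the function name and the target file are placeholders, and the command is wrapped in a function so nothing runs until you call it.

```shell
# Hypothetical helper: write rpk's Bash completion script to the file given as $1.
write_rpk_completion() {
  rpk generate shell-completion bash > "$1"
}
```

For example, call `write_rpk_completion ~/.rpk-completion.bash` and then source that file from your shell profile.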

(0.50)
Let's look at what is in rpk. Here are all the first-level commands. Let's go for the more obvious one: rpk version. This just shows you what version of rpk you're running. Make sure your rpk version aligns with your Redpanda brokers for the best result.

So you need to get a cluster going. That's where you'll find rpk container super handy. It helps you quickly set up a Redpanda cluster on your local computer using Docker under the hood.

When you're done with the cluster, simply run the purge command and it will clean up and undo everything.
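As a rough sketch of the local workflow described above, assuming Docker is running: the function names and the choice of three brokers are illustrative, and the commands are wrapped in functions so nothing runs until you call them.

```shell
# Start a local cluster in Docker; three brokers is an arbitrary choice.
local_cluster_up() {
  rpk container start -n 3
}

# Stop the local cluster and remove its containers and data.
local_cluster_down() {
  rpk container purge
}
```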

A Redpanda cluster consists of multiple nodes or brokers working together as a distributed system to balance and share the load. To interact with the cluster as a whole, we'll use the rpk cluster command.

The majority of configurations are done at the cluster level, so if you don't know what configurations are available, simply type rpk cluster config and it will show you what the options are and their current values.
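A small sketch of reading and changing a single cluster property. The helper names are hypothetical, and the property name and value are passed in as arguments rather than hard-coded, since the available properties depend on your Redpanda version.

```shell
# Read the current value of one cluster property ($1 is the property name).
cluster_config_get() {
  rpk cluster config get "$1"
}

# Change one cluster property ($1 is the property name, $2 the new value).
cluster_config_set() {
  rpk cluster config set "$1" "$2"
}
```

To browse every property at once, `rpk cluster config edit` opens the full configuration in your editor.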

(1.49)
The cluster-level configurations consist of several parts. There are things like local and tiered storage, where log retention-related configurations are used to determine when to delete data or offload it to cloud storage.

There are Kafka admin API settings to set timeouts, limit connection and traffic rates for consumers, and configure internal RPC limits and log segments.

You can also configure metrics, auto rebalancing, and idempotency, as well as enable transactions and configure their coordinator settings.

If you want to set the default values for topics and partitions across the entire cluster, you can also do it here. And of course you can also set each topic individually. We'll see that later.

The rpk cluster command is essentially our main tool for managing all things related to cluster level config or any operations in that level. Say you need to upgrade your broker or node versions – what you do is use the rpk cluster maintenance command.

This puts a broker into maintenance mode and automatically reassigns leaders to other active brokers. Then you can quickly perform an upgrade and bring it back online when it's done.
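The upgrade flow above might look something like this. The helper names are hypothetical, and the node ID is whatever `rpk redpanda admin brokers list` reports for the broker you are upgrading.

```shell
# Drain a broker before upgrading it ($1 is the broker's node ID),
# then print the maintenance status so you can watch the drain.
begin_maintenance() {
  rpk cluster maintenance enable "$1"
  rpk cluster maintenance status
}

# Return the broker to normal service once the upgrade is done.
end_maintenance() {
  rpk cluster maintenance disable "$1"
}
```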

(3.03)
It's pretty neat for diagnosing cluster problems too. Sometimes you've got all sorts of things causing the cluster to slow down or act up. So you've got this nifty rpk cluster self-test command. It goes ahead and checks how the broker disks are performing and even tests the networks between them.

That makes troubleshooting a whole lot easier for the admins. Another useful trick up our sleeve is the cluster partition balancer status command. This focuses on making sure the clusters are running at optimal capacity. This helps us figure out problems like when the cluster is busy reassigning and rebalancing due to a broker or node outage.
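A sketch of the diagnostic commands mentioned above, wrapped in hypothetical helper functions. The self-test puts load on disks and network and may ask for confirmation, so it is kept separate from the read-only status checks.

```shell
# Kick off the disk and network self-test across the brokers.
run_self_test() {
  rpk cluster self-test start
}

# Show the results of the most recent self-test run.
self_test_results() {
  rpk cluster self-test status
}

# Report whether the partition balancer is idle or busy moving partitions.
balancer_status() {
  rpk cluster partitions balancer-status
}
```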

Now that we're done configuring the Redpanda cluster as a whole, what about each individual Redpanda broker?

This is when rpk redpanda comes into play. Let's start with the most used one for me, rpk redpanda admin brokers list. It lists the brokers in your cluster. It's super helpful if you first just want to see a high-level view of the formation of the cluster, what the broker versions are, and a quick view of the hardware assignment, and also whether each broker is actually alive.

(4.09)
When using rpk redpanda, think of the operations that you want to do with a single broker in the cluster. One of the common ones is adding or removing brokers to or from the cluster and tracking the status of the progress.
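A sketch of listing brokers and removing one from the cluster. The helper names are hypothetical, the node ID comes from the list output, and the `decommission-status` subcommand is assumed to be available in your rpk version.

```shell
# Show every broker with its node ID, version, and liveness.
list_brokers() {
  rpk redpanda admin brokers list
}

# Start removing a broker ($1 is its node ID), then report progress.
decommission_broker() {
  rpk redpanda admin brokers decommission "$1"
  rpk redpanda admin brokers decommission-status "$1"
}
```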

Of course, you can set up the broker settings easily. For instance, with data directory you decide where Redpanda keeps its files. The seed server helps your broker join an existing group, while the rack lets you set the failure zone if you're using follower fetching.

And if you ever need to adjust the port for anything like Kafka, RPC, or admin, just use rpk redpanda config to configure them. There's also a crash loop limit, which ensures your broker doesn't keep crashing repeatedly by setting a limit on how many times it can crash in an hour.

(5.00)
Lastly, there's the node ID, sort of a name tag for the broker in its group. It's usually best to let Redpanda assign this ID itself.
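A sketch of changing two of the broker settings mentioned above. It assumes the broker's redpanda.yaml is writable by the current user and that the settings live under the `redpanda.` prefix in that file; the helper name and argument order are illustrative.

```shell
# Set the data directory ($1) and rack label ($2) in this broker's redpanda.yaml.
# These are node-level settings, so the broker needs a restart to pick them up.
configure_broker() {
  rpk redpanda config set redpanda.data_directory "$1"
  rpk redpanda config set redpanda.rack "$2"
}
```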

Redpanda has two modes: Development and Production mode. Dev mode is more flexible about the hardware it uses. It doesn't make strict demands on memory and cores, and it eases up on things like thread connections, idle check-ins, and being continuously active with disk tasks.

Setting dev or production mode, decommissioning, recommissioning, or sometimes changing some of the broker configurations requires a restart of the Redpanda broker.
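Switching modes could look like the sketch below. The helper name is hypothetical, and the mode names `development` and `production` are assumed to be accepted by your rpk version; the change only takes effect after the broker restarts.

```shell
# Switch this broker's configuration mode ($1 is "development" or "production").
set_broker_mode() {
  rpk redpanda mode "$1"
}
```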

We can use rpk redpanda start and stop to restart the brokers individually. For admins out there, the autotuner is something that you must know. If you're using Linux and aiming for production settings, this is your go-to. But keep in mind you need root access.

With the command rpk redpanda tune, Redpanda gets to know your machine's hardware and adjusts Linux kernel settings on the broker to get the best out of Redpanda.

It checks things like the disk IO, the networks, and the CPUs, and helps get everything running in the most optimized way.

(6.08)
Another command is rpk iotune, which focuses on enhancing the input and output performance for a specific Redpanda setup. We'll probably dive deeper into that in another autotuning video.
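A sketch of the tuning commands on a Linux host. The helper names are hypothetical, and `sudo` is used for the tuners because they need root access; whether `rpk iotune` also needs elevated permissions depends on where it writes its results on your system.

```shell
# List the available tuners and whether each one is supported on this machine.
list_tuners() {
  rpk redpanda tune list
}

# Apply every tuner; this changes kernel settings, so it needs root.
run_all_tuners() {
  sudo rpk redpanda tune all
}

# Benchmark the disk so Redpanda can size its I/O settings for this hardware.
benchmark_disk_io() {
  rpk iotune
}
```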

As a developer, my go-to rpk command is definitely rpk topic. It's very self-explanatory, as it deals with all topic-related needs.

This includes tasks from creating and deleting topics to monitoring topic statuses and adjusting settings such as disk retention policies and adding partitions. The cluster will dynamically trigger reassignment of leaders for load balancing and fault tolerance purposes.
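The topic tasks listed above might be strung together like this. The helper names, partition and replica counts, and the one-day retention value are illustrative choices, not recommendations.

```shell
# Create a topic ($1), inspect it, set retention to one day, and add two partitions.
topic_walkthrough() {
  rpk topic create "$1" -p 3 -r 1
  rpk topic describe "$1"
  rpk topic alter-config "$1" --set retention.ms=86400000
  rpk topic add-partitions "$1" --num 2
}

# Delete the topic ($1) when it is no longer needed.
delete_topic() {
  rpk topic delete "$1"
}
```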

With topics, produce and consume are my go-to commands for testing. The produce command is very underrated. It's incredibly versatile. Not only can you input data manually or from a file, but topic produce also lets you organize inputs using the format flag.

This means you can easily read from entire files and send the data into a topic. You can also choose to split data by byte size or specific delimiters.
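A sketch of producing from a file with the format flag. It assumes each line of the input file looks like `key,value`; the helper name and the comma delimiter are illustrative.

```shell
# Send every "key,value" line of a file ($2) to a topic ($1).
# In the format string, %k is the record key and %v is the record value.
produce_from_file() {
  rpk topic produce "$1" -f '%k,%v\n' < "$2"
}
```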

(7.16)
When it comes to consuming data, similar to data production, you have the option to specify the data format during consumption and configure the range and criteria for data retrieval.
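On the consuming side, a sketch that reads a fixed number of records from the start of a topic. The helper name and the output layout are illustrative; in the format string, %p is the partition, %o the offset, %k the key, and %v the value.

```shell
# Print the first $2 records of topic $1, one per line, starting at the earliest offset.
consume_first() {
  rpk topic consume "$1" -o start -n "$2" -f '%p/%o %k=%v\n'
}
```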

Occasionally you may encounter situations where a single topic is consuming an excessive amount of local disk space. In this case, the trim-prefix feature can assist in maintaining a balance. This feature allows you to selectively remove records from a specific topic, freeing up space to accommodate other data.
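A sketch of trimming a topic, assuming an rpk version that includes `rpk topic trim-prefix`. The helper name is hypothetical. Records below the given offset are removed permanently, so treat this as destructive.

```shell
# Remove all records in topic $1 that sit before offset $2.
trim_topic() {
  rpk topic trim-prefix "$1" --offset "$2"
}
```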

If you manage different clusters sometimes it gets confusing. Profile helps you keep everything organized. It makes switching between clusters easier and less prone to mistakes.

(8.01)
You can create new profiles by specifying the broker address and admin host. A nice trick for using the profile is to specify different colors in the prompt so you have a better idea of which cluster you're connecting to.
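A sketch of creating and switching profiles. The helper names are hypothetical, and the `kafka_api.brokers` and `admin_api.addresses` keys are an assumption based on the field names in the rpk profile YAML, so check `rpk profile create --help` for the keys your version accepts.

```shell
# Create a profile named $1 pointing at a Kafka address ($2) and an admin address ($3).
create_profile() {
  rpk profile create "$1" --set kafka_api.brokers="$2" --set admin_api.addresses="$3"
}

# Switch the active profile to $1.
switch_profile() {
  rpk profile use "$1"
}
```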

Another useful one is the ACL command. This helps you secure your Redpanda cluster by creating credentials, setting up rules, giving access, and setting permissions. And of course you can do that in the UI as well.
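A sketch of creating a user and granting it read access to one topic. The helper name is hypothetical. The `rpk acl user` subcommand matches the 23.2-era rpk discussed in this video, and newer releases may place user management under a different command.

```shell
# Create user $1 with password $2, allow it to read topic $3, then list the ACLs.
grant_read_access() {
  rpk acl user create "$1" -p "$2"
  rpk acl create --allow-principal "User:$1" --operation read --topic "$3"
  rpk acl list
}
```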

The rpk group command is great for dealing with consumers. You can see a list of all consumers connecting to your clusters, manage them, and even change the offsets.
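A sketch of inspecting a consumer group and rewinding it. The helper names are hypothetical. Seeking generally requires the group's consumers to be stopped first.

```shell
# List all consumer groups, then show members and lag for group $1.
inspect_group() {
  rpk group list
  rpk group describe "$1"
}

# Reset group $1 so it reads again from the earliest available offsets.
rewind_group() {
  rpk group seek "$1" --to start
}
```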

There's also rpk generate. Instead of using an rpk command to work with the Redpanda cluster, this is used to generate templates for related services surrounding Redpanda. These include templates for Grafana dashboards and for Prometheus configurations for scraping metrics.
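A sketch of saving the generated templates to files. The helper name and output paths are placeholders, and depending on your rpk version either command may need extra flags, such as the addresses of the brokers to scrape.

```shell
# Write a Prometheus scrape config to $1 and a Grafana dashboard definition to $2.
generate_monitoring_templates() {
  rpk generate prometheus-config > "$1"
  rpk generate grafana-dashboard > "$2"
}
```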

Last but not least, if you're using any of the Redpanda Cloud options, you can simply use rpk cloud login to log directly into the cloud and start interacting with the cluster.

For Bring Your Own Cloud users, rpk is used to kick off the creation of your clusters.

These are just a fraction of what rpk can do. I hope you enjoyed the quick overview. Let me know in the comments what your favorite use of rpk is. And as always, please subscribe, and see you in the next video.
