As a dev tool engineer, spawning sub-processes was part of my daily job. I got used to doing it in Node.js with the [child_process](https://nodejs.org/api/child_process.html) module from the standard library.
I recently switched to a system programming position (still in the dev tool world). I had to change my daily companion from Node.js to C++.
Then I had to learn how to spawn child processes in C++. Even if the terms and concepts are similar, the ergonomics of the C API we need to use for this purpose can intimidate people coming from a web background.
This article aims to clear up the gray areas and make them easier for you!
✅ What you’ll be able to do at the end of this article?
❌ What won’t this article cover?
stderr
or stdout
Ready? Let’s start our journey by defining terms.
When you launch a program (binary) on your computer, the OS will create a process object stored in memory. This object contains a state (new, ready, running, waiting, terminated).
Depending on the machine’s resources and their availability, the OS scheduler executes the program and moves the process between these states.
If you want to dig into this topic a bit more, I recommend this video:
Now it’s time to see which API we’ll use to spawn a process in C++.
There are probably different approaches to spawning a process in C++. My first web search brought me to the posix_spawn/posix_spawnp functions.
According to the man page:
The posix_spawn() and posix_spawnp() functions are used to create a new child process that executes a specified file.
“Executing a file?” (a voice pops into my mind)
Yes, in this context, file means your executable.
“Okay, but what is the difference between posix_spawn() and posix_spawnp()?”
The difference is about the second argument. In the posix_spawn case, the argument should be a path to the executable (absolute or relative), but in the posix_spawnp case, the executable can be specified as a simple name.
posix_spawn -> "/usr/bin/curl"
posix_spawnp -> "curl"
In the latter case, the system will search for the executable in the list of directories stored in the PATH environment variable. Now let’s take a look at the API itself:
In the following listing, you’ll find the posix_spawnp API description.
int posix_spawnp(pid_t *restrict pid,
                 const char *restrict file,
                 const posix_spawn_file_actions_t *restrict file_actions,
                 const posix_spawnattr_t *restrict attrp,
                 char *const argv[restrict],
                 char *const envp[restrict]);
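To make the path lookup difference concrete, here is a minimal sketch that calls either function with the same arguments. The helper name runs_ok is hypothetical (it is not part of any library), and the sketch assumes a POSIX system where /bin/sh exists.

```cpp
#include <spawn.h>
#include <sys/wait.h>

// Made available by execve(2) when a process begins
extern char **environ;

// Hypothetical helper: spawns `file` with either lookup strategy and reports
// whether the child was created and then exited with code 0.
bool runs_ok(bool search_path, const char *file, char *const argv[]) {
  pid_t pid;
  const int err =
      search_path
          // posix_spawnp: a name without a slash is searched in PATH
          ? posix_spawnp(&pid, file, nullptr, nullptr, argv, environ)
          // posix_spawn: `file` is used as a path, with no PATH search
          : posix_spawn(&pid, file, nullptr, nullptr, argv, environ);
  if (err != 0) {
    return false; // the child could not be created
  }
  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    return false;
  }
  // Some implementations only report a missing executable through the
  // child's exit code, so check that as well.
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

With an argv of {"sh", "-c", "exit 0", nullptr}, both runs_ok(true, "sh", argv) and runs_ok(false, "/bin/sh", argv) should succeed on a typical Linux or macOS machine, while runs_ok(false, "sh", argv) should fail unless the current directory happens to contain an executable named sh.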
We will skip parameters 3 and 4 (file_actions and attrp): in my experience so far, I haven’t had to deal with them, so we will pass nullptr for both.
There are a few caveats:
- argv: the first item should be the same as the file argument
- argv: the last item should be 0 (look at the execve documentation for more details)
- envp: pass environ, which should be declared as extern char** environ; in the file; it is made available by execve(2) when a process begins
Let’s try to implement a basic version!
Let’s say that we want to spawn a basic curl command:
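#include <cstdio>
#include <cstdlib>
#include <errno.h>
#include <spawn.h>
#include <sys/wait.h>

// NOTE: made available by execve(2) when a process begins
extern char **environ;

int main() {
  pid_t pid; // #1
  // String literals are const in C++, but argv is an array of char *,
  // so we copy them into arrays first
  char command[] = "curl";
  char url[] = "https://jsonplaceholder.typicode.com/posts/1";
  char *args[] = {command, url, nullptr}; // #2
  int status = posix_spawnp(&pid, // #3
                            args[0],
                            nullptr,
                            nullptr,
                            args,
                            environ);
  if (status != 0) {
    // posix_spawnp returns the error number instead of setting errno
    errno = status; // #5
    perror("posix_spawn");
    exit(EXIT_FAILURE);
  }
  int s = waitpid(pid, &status, 0); // #4
  if (s == -1) {
    perror("waitpid");
    exit(EXIT_FAILURE);
  }
  // Succeed only if curl exited normally with code 0
  return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? EXIT_SUCCESS
                                                       : EXIT_FAILURE;
}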
What we have in the body of main:
- #1: First, we declare the pid argument as a pid_t.
- #2: Then we declare an array of strings (named args) that contains the program we want to run, followed by its arguments.
- #3: We invoke the posix_spawnp function with pid and args and store the result in a status variable. Note that we use args twice: once for the file argument (args[0]) and then for the argv argument.
- #4: The last part is about waiting for the process to end with the waitpid function (sys/wait.h header), which returns -1 in case of error.
- #5: We want to display an error message when something goes wrong. That’s why we set the errno variable (from the errno.h header) and then call perror with a label. It should print something like: posix_spawn: <error-message>
If you want to play with it, visit this link.
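As a side note on step #4: the status filled in by waitpid is an encoded value, not the exit code itself, and the macros from sys/wait.h decode it. Here is a minimal sketch of that decoding. The helper name exit_code_of is hypothetical, and returning 128 + signal number for a killed child is only a convention borrowed from shells, not something the API imposes.

```cpp
#include <spawn.h>
#include <sys/wait.h>

// Made available by execve(2) when a process begins
extern char **environ;

// Hypothetical helper: spawns argv[0] (searched in PATH) and returns
// - the child's exit code if it terminated normally,
// - 128 + signal number if a signal killed it (shell-like convention),
// - -1 if the child could not be spawned or waited for.
int exit_code_of(char *const argv[]) {
  pid_t pid;
  if (posix_spawnp(&pid, argv[0], nullptr, nullptr, argv, environ) != 0) {
    return -1;
  }
  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    return -1;
  }
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status); // normal termination
  }
  if (WIFSIGNALED(status)) {
    return 128 + WTERMSIG(status); // terminated by a signal
  }
  return -1;
}
```

For instance, spawning sh -c "exit 3" through this helper should give 3, which makes it possible to tell "curl ran and failed" apart from "curl could not be started".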
Now let’s reshape the whole implementation. This version is not reusable at the moment.
First, we should wrap the body of our main function into a spawn function:
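#include <cstdio>
#include <cstdlib>
#include <errno.h>
#include <spawn.h>
#include <sys/wait.h>

// NOTE: made available by execve(2) when a process begins
extern char **environ;

int spawn(char *const args[]) {
  pid_t pid;
  int status = posix_spawnp(&pid,
                            args[0],
                            nullptr,
                            nullptr,
                            args,
                            environ);
  if (status != 0) {
    // posix_spawnp returns the error number instead of setting errno
    errno = status;
    perror("posix_spawn");
    exit(EXIT_FAILURE);
  }
  int s = waitpid(pid, &status, 0);
  if (s == -1) {
    perror("waitpid");
    exit(EXIT_FAILURE);
  }
  // Succeed only if the child exited normally with code 0
  return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? EXIT_SUCCESS
                                                       : EXIT_FAILURE;
}

int main() {
  // String literals are const in C++, so we copy them into arrays first
  char command[] = "curl";
  char url[] = "https://jsonplaceholder.typicode.com/posts/1";
  char *args[] = {command, url, nullptr};
  return spawn(args);
}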
That’s not enough! Ideally, we’d like to mimic the Node.js API and be able to pass the command and arguments separately. Something like: spawn("curl", ["https://www.google.fr"]);
Also, instead of using old C strings and arrays, we could use std::string & std::vector.
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <errno.h>
#include <iterator>
#include <spawn.h>
#include <string>
#include <sys/wait.h>
#include <vector>

// NOTE: made available by execve(2) when a process begins
extern char **environ;

// #2
std::vector<const char *> format_args(const std::string &command,
                                      const std::vector<std::string> &arguments) {
  // NOTE: we need two more slots:
  // - one for the command itself
  // - another for the last "0" item (every slot starts as a null pointer)
  std::vector<const char *> cstr_args(arguments.size() + 2);
  std::transform(std::cbegin(arguments), std::cend(arguments),
                 std::begin(cstr_args) + 1,
                 [](const auto &v) { return v.c_str(); });
  cstr_args[0] = command.c_str();
  return cstr_args;
}

int spawn(const std::string &command, const std::vector<std::string> &args) {
  pid_t pid;
  const std::vector<const char *> cstr_args = format_args(command, args);
  // #3
  char *const *raw_args = const_cast<char *const *>(cstr_args.data());
  int status = posix_spawnp(&pid,
                            raw_args[0],
                            nullptr,
                            nullptr,
                            raw_args,
                            environ);
  if (status != 0) {
    // posix_spawnp returns the error number instead of setting errno
    errno = status;
    perror("posix_spawn");
    return EXIT_FAILURE;
  }
  int s = waitpid(pid, &status, 0);
  if (s == -1) {
    perror("waitpid");
    return EXIT_FAILURE;
  }
  // Succeed only if the child exited normally with code 0
  return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? EXIT_SUCCESS
                                                       : EXIT_FAILURE;
}

int main() {
  const std::string command = "curl";
  const std::vector<std::string> args{
      "https://jsonplaceholder.typicode.com/posts/1"};
  int status = spawn(command, args); // #1
  return status;
}
What we have in this new version:
- #1: the spawn function now takes two parameters, one for the command (std::string) and another for the arguments (std::vector<std::string>).
- #2: then the spawn function has to format this input to fit into an old-fashioned array of strings. That’s the purpose of the format_args function, which returns a vector of C strings.
- #3: finally, we have to convert this vector into an old-fashioned array; it’s trivial because std::vector exposes a .data() method that makes that conversion easier.
💡 If you are not familiar with C++, just consider the format_args function as a magic box that converts our arguments into an old-fashioned C array.
If you want to play with the full version:
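For readers who want to peek inside the magic box, here is a hedged, loop-based sketch that should produce the same layout as format_args. The name format_args_explicit is hypothetical and this is not the code used in the article, only an equivalent written with plain push_back calls.

```cpp
#include <string>
#include <vector>

// Hypothetical, loop-based equivalent of format_args.
// Resulting layout: [command, arg1, ..., argN, nullptr]
// The pointers are borrowed from `command` and `arguments`, so both must
// outlive the returned vector.
std::vector<const char *> format_args_explicit(
    const std::string &command, const std::vector<std::string> &arguments) {
  std::vector<const char *> cstr_args;
  cstr_args.reserve(arguments.size() + 2);
  cstr_args.push_back(command.c_str()); // argv[0]: the command itself
  for (const std::string &argument : arguments) {
    cstr_args.push_back(argument.c_str());
  }
  cstr_args.push_back(nullptr); // the terminating "0" item
  return cstr_args;
}
```

One design point worth keeping in mind: since the vector only stores borrowed pointers, passing temporary strings to such a function would leave it pointing at destroyed data. Taking the inputs by const reference and using the result before they go out of scope, as spawn does, avoids that.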
Congratulations! You managed to finish this post and you’re now ready to spawn processes in C++.
Let me know in the comment section if you want to see a Windows version of this post.
If you want to read more about spawning processes, I advise you to read:
If you want to understand how I convert a std::vector<std::string> to a char *args[], please take a look at: