Working with CARLA x Cosmos Transfer
CARLA can be connected to NVIDIA Cosmos Transfer to create hyper-realistic variations of the synthetic data generated in CARLA.
In this integration, CARLA generates a set of videos, including RGB, semantic segmentation, depth, and edges, used to control Cosmos Transfer. These control videos are generated by using the carla_cosmos_gen.py script. Cosmos Transfer uses these control videos in combination with a text prompt and some additional tuning parameters to produce new variations of the video.
This integration uses a client-server architecture. The cosmos_client.py script sends requests to a Cosmos Transfer server, which processes each request and sends the generated video back to the client (this process takes 1-2 minutes). Users therefore need to deploy a Cosmos Transfer service first. The next section explains the different options for deploying your own Cosmos Transfer service.
Deploying a Cosmos Transfer server
Cosmos Transfer requires high-performance datacenter GPUs such as the NVIDIA H100. We have created multiple ways to deploy Cosmos Transfer servers easily.
Option 1: 1-click deployment on NVIDIA Brev
The CARLA team has created a Brev launchable to enable users to create their own Cosmos Transfer servers with ease. To this end:
1. Sign up to NVIDIA Brev here. Follow the instructions on the website to fund your account.
2. Go to the link carla-x-cosmos-transfer1-lambda. Click on Deploy Launchable.

3. Click on Go to Instance Page.

4. Wait until the instance has started.
5. Make the URL public.
Click on Edit Access (take note of the Shareable URL and port):

Toggle Make Public:

6. Your Cosmos Transfer service is ready!
Note
A particular cloud vendor may have only a limited number of GPU instances available. In that case, try switching to a different provider.

Option 2: Deploying your own service somewhere else
If you have access to the appropriate hardware, you can deploy Cosmos Transfer in a Docker container. The files required to build the server image can be found in the PythonAPI/examples/nvidia/cosmos/server directory inside the root folder of your CARLA installation or package. Follow these steps to build and deploy a Cosmos Transfer server:
1. Install Docker: If Docker is not already installed on your system, install it.
2. Install Conda: Follow these instructions to install Conda.
3. Build the server
Open a terminal inside PythonAPI/examples/nvidia/cosmos/server and run the make_docker.sh script:
./make_docker.sh
This step is likely to take 1-2 hours.
4. Deploy
Deploy your Docker image in your preferred environment. We recommend a cluster with at least 8 x H100 GPUs, although a single H100 GPU should be enough for lighter workloads. Run the Docker image with the following command:
docker run -d --shm-size 96g --gpus=all --ipc=host -p 8080:8080 cosmos-transfer1-carla
5. Make requests with the client
Once you have deployed your Cosmos server, you can make requests to it using the cosmos_client.py script, providing the appropriate IP address and port for the endpoint argument.
For example, for a locally deployed server on port 8080:
python cosmos_client.py http://localhost:8080 example_data/prompts/rain.toml
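Before sending a first request, it can help to confirm that the server port is reachable at all. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical and not part of the CARLA scripts. An open port only shows that the container is accepting connections, not that the model has finished loading.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # localhost:8080 matches the docker run command above;
    # replace both values for a remote or Brev-hosted server.
    state = "reachable" if is_port_open("localhost", 8080) else "not reachable"
    print(f"Cosmos Transfer server port is {state}")
```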
Using the CARLA x Cosmos Transfer Client
In order to start generating videos, you will need to install the dependencies for the Cosmos Transfer client. The Cosmos Transfer generation process involves two steps:
1. Generating control videos for Cosmos Transfer with CARLA
This step generates several control videos for Cosmos Transfer from a CARLA simulation log file. The videos generated may include the following:
- RGB
- Depth
- Edges
- Semantic segmentation
- Instance segmentation
- Sky mask
2. Generating style-transfer variations using Cosmos Transfer
This step generates style-transfer variations of the original control videos using the Cosmos Transfer1 model. The control videos created by CARLA are sent through to a server running the Cosmos Transfer1 model. Requests are sent to the server through a simple API using the cosmos_client.py script.
The Cosmos Transfer1 model can be controlled using the prompt and several other control parameters defined in a TOML file. You can find numerous examples of Cosmos Transfer1 configurations inside the client/example_data/prompts directory.
Install dependencies
1. Download CARLA 0.9.16 or the latest nightly build here
2. Once downloaded, uncompress the archive:
tar -xzvf CARLA_0.9.16.tar.gz
3. Install conda by following these instructions. You can skip this step if conda is already installed on your system.
4. Create a conda environment (e.g., carla-cosmos-client) and install all dependencies:
cd PythonAPI/examples/nvidia/cosmos
# Create the carla-cosmos-client conda environment.
conda env create --file client/carla-cosmos-client.yaml
# Activate the carla-cosmos-client conda environment.
conda activate carla-cosmos-client
# Install the dependencies.
pip install -r requirements.txt
pip install -r client/requirements_client.txt
5. To install the CARLA Python client, navigate to the PythonAPI/carla/dist folder within the CARLA installation and locate the .whl file corresponding to your Python version (e.g., carla-0.9.16-cp310-cp310-linux_x86_64.whl):
# Replace carla-0.9.16-cp310-cp310-linux_x86_64.whl with your actual filename
pip install carla-0.9.16-cp310-cp310-linux_x86_64.whl
Generating Cosmos-Transfer Control Inputs
1. Start CARLA
Navigate to the root folder of your CARLA installation and execute the launch script:
./CarlaUE4.sh
2. Generate control videos from a CARLA log
If you want to generate new control inputs using an example, you can run the PythonAPI/examples/nvidia/cosmos/client/carla_cosmos_gen.py script with the example log named iai_carla_synthetic_log_1731622446_actorPOV4641_startTime3.7s_log.log in the PythonAPI/examples/nvidia/cosmos/client/example_data/logs/inverted_ai/ directory. You will find several other example log files in the same directory to experiment with.
A typical invocation will look like this:
cd PythonAPI/examples/nvidia/cosmos/client
# Replace /full_path_to_log/your_log.log with the absolute path to your log file and output_path with your path to store the results (can be a relative path)
python carla_cosmos_gen.py -f /full_path_to_log/your_log.log --sensors cosmos_aov.yaml --class-filter-config filter_semantic_classes.yaml -c ego_sim_id -s 0.0 -d 5.0 -o output_path
The ego_sim_id value is the actor ID of the ego vehicle, which the Cosmos generation script uses to identify it. If you are recording your own scenarios, remember to note the ego vehicle's actor ID from the id attribute.
For example:
python carla_cosmos_gen.py -f ${PWD}/example_data/logs/inverted_ai/iai_carla_synthetic_log_1731622446_actorPOV4641_startTime3.7s_log.log \
--sensors cosmos_aov.yaml \
--class-filter-config filter_semantic_classes.yaml \
-c 4641 -s 0.0 -d 5.0 -o output
Note: For the example log, replace ego_sim_id with 4641.
This will produce a set of videos that will later be used to control Cosmos Transfer.
Generating style-transfer variations using Cosmos Transfer
Once you have a set of artifacts (the control videos) and the Cosmos Transfer service is deployed and active, use the cosmos_client.py script to make requests. The following command generates Cosmos Transfer style-transfer videos using the prompt and parameters in the rain.toml example configuration file, located in client/example_data/prompts.
cd PythonAPI/examples/nvidia/cosmos/client
# Replace http://url_to_server:port with the URL and port of your CARLA-Cosmos-Transfer server
python cosmos_client.py http://url_to_server:port example_data/prompts/rain.toml
The first argument is the URL and port of the Cosmos Transfer server. If you are running a local server from a Docker container, this may be localhost or the network IP address of your server. If you are using NVIDIA Brev, the appropriate URL and port are given in the details of the Brev instance.
You can edit the text prompt and control parameters in the rain.toml configuration to experiment and see the effects. Numerous other examples are provided in the same directory and you can learn more about the parameters in the following section.
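To compare several prompt configurations against the same server, the client can be called once per TOML file. The sketch below is one possible way to do this from Python, assuming it is run from the client directory so that cosmos_client.py resolves. The function name run_prompts, the dry_run switch, and the choice of one .mp4 output file per prompt are illustrative assumptions, not part of the CARLA scripts; only the endpoint, config path, and --output arguments come from the client itself.

```python
import subprocess
import sys
from pathlib import Path


def run_prompts(endpoint, prompt_dir, output_dir, dry_run=False):
    """Call cosmos_client.py once for every TOML file in prompt_dir.

    Returns the list of commands. With dry_run=True nothing is executed.
    """
    out_root = Path(output_dir)
    commands = []
    for toml_path in sorted(Path(prompt_dir).glob("*.toml")):
        # Assumption: name each result after its prompt file, e.g. rain.mp4.
        out_file = out_root / f"{toml_path.stem}.mp4"
        commands.append([
            sys.executable, "cosmos_client.py",
            endpoint, str(toml_path),
            "--output", str(out_file),
        ])
    if not dry_run:
        out_root.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            # Requests run sequentially; check=True stops at the first failure.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Print the commands without contacting a server.
    for cmd in run_prompts("http://localhost:8080",
                           "example_data/prompts", "outputs", dry_run=True):
        print(" ".join(cmd))
```

Since each request takes on the order of minutes, running the prompts one at a time keeps the load on a single-GPU server predictable.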
You can optionally override some fields from the TOML on the command line and choose where to save the output:
# --depth-video and --vis-video are optional
python cosmos_client.py http://url_to_server:port \
example_data/prompts/rain.toml \
--output outputs/ \
--input-video example_data/artifacts/rgb.mp4 \
--edge-video example_data/artifacts/edges.mp4 \
--depth-video example_data/artifacts/depth.mp4 \
--seg-video example_data/artifacts/semantic_segmentation.mp4 \
--vis-video example_data/artifacts/vis_control.mp4 \
--seed 2048
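The effect of these overrides can be pictured as a merge of command-line values over the parsed TOML. The sketch below is an illustration only, not the client's actual code: apply_overrides is a hypothetical helper, and it assumes the mapping documented for the client (--input-video replaces input_video_path, each --*-video flag replaces the matching control's input_control, and --seed replaces seed).

```python
import copy


def apply_overrides(config, input_video=None, edge_video=None,
                    depth_video=None, seg_video=None, vis_video=None,
                    seed=None):
    """Return a copy of config with command-line style overrides applied."""
    merged = copy.deepcopy(config)  # leave the parsed TOML untouched
    if input_video is not None:
        merged["input_video_path"] = input_video
    if seed is not None:
        merged["seed"] = seed
    control_overrides = {
        "edge": edge_video,
        "depth": depth_video,
        "seg": seg_video,
        "vis": vis_video,
    }
    for name, path in control_overrides.items():
        if path is not None:
            # Assumption: overriding a control that is absent from the TOML
            # creates its table; a control_weight would still be needed.
            merged.setdefault(name, {})["input_control"] = path
    return merged


if __name__ == "__main__":
    toml_values = {
        "input_video_path": "rgb.mp4",
        "seed": 1,
        "edge": {"input_control": "edges.mp4", "control_weight": 0.5},
    }
    print(apply_overrides(toml_values, seed=2048, edge_video="new_edges.mp4"))
```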
Understanding the Cosmos-Transfer Configuration
This section describes the TOML configuration (see example_data/prompts/rain.toml). The client accepts a flat schema as shown below, and also supports the same keys nested under a top-level controlnet_specs table.
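To make the two layouts concrete, the sketch below shows the same abbreviated configuration in both forms, written as the Python dictionaries a TOML parser would produce, together with a hypothetical normalize_config helper that reduces either form to the flat one. The field values are illustrative, and the precedence rule for keys that appear both at the top level and inside controlnet_specs is an assumption; the client may resolve such duplicates differently.

```python
def normalize_config(config):
    """Return the flat form of a configuration given in either layout."""
    nested = config.get("controlnet_specs")
    if isinstance(nested, dict):
        flat = {k: v for k, v in config.items() if k != "controlnet_specs"}
        # Assumption: keys inside controlnet_specs win over top-level duplicates.
        flat.update(nested)
        return flat
    return dict(config)


# Flat layout: scalar fields and control tables sit at the top level.
flat_layout = {
    "prompt": "Heavy rain at dusk",  # illustrative value
    "seed": 2048,
    "edge": {"input_control": "edges.mp4", "control_weight": 0.5},
}

# Nested layout: the same keys under a top-level controlnet_specs table.
nested_layout = {"controlnet_specs": dict(flat_layout)}

if __name__ == "__main__":
    print(normalize_config(nested_layout) == normalize_config(flat_layout))  # prints True
```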
Required fields
| Field | Type | Description |
|---|---|---|
| prompt | string | Text describing the desired scene |
| input_video_path | string | Path to the input video file |
| negative_prompt | string | Text describing what to avoid in the output |
| num_steps | int | Number of diffusion steps |
| guidance | float | CFG guidance scale |
| sigma_max | float | Partial noise added to input; [0, 80]. >= 80 ignores the input video |
| seed | int | Random seed for reproducibility |
Optional scalar fields
| Field | Type | Description |
|---|---|---|
| blur_strength | string | Blur strength for preparing the vis control input. One of: very_low, low, medium, high, very_high |
| canny_threshold | string | Optional threshold preset used when generating edges externally |
Controls (all optional)
If present, controls must follow these rules:
| Control | Required keys | Notes |
|---|---|---|
| edge | input_control (string), control_weight (number) | Path typically points to an edges video |
| depth | input_control (string), control_weight (number) | Path points to a depth video |
| seg | input_control (string), control_weight (number) | Path points to a semantic segmentation video |
| vis | control_weight (number) | input_control is optional |
Validation constraints
The client validates the following aspects before sending the data:
1. All required scalar fields above must be present and of the correct type.
2. Controls are optional. If edge, depth, or seg are provided, both control_weight and input_control must be present. If vis is provided, control_weight must be present and input_control is optional.
Example TOMLs are provided in client/example_data/prompts/.
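The validation rules above can be sketched in a few lines of Python. This is an illustration of the rules as documented here, not the client's actual implementation: validate_config is a hypothetical helper that works on an already parsed, flat configuration dictionary, and accepting integers where a float is expected is an assumption.

```python
REQUIRED_FIELDS = {
    "prompt": str,
    "input_video_path": str,
    "negative_prompt": str,
    "num_steps": int,
    "guidance": float,
    "sigma_max": float,
    "seed": int,
}
CONTROLS_NEEDING_INPUT = ("edge", "depth", "seg")


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_config(config):
    """Return a list of problems; an empty list means the config passed."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in config:
            problems.append(f"missing required field: {field}")
            continue
        value = config[field]
        if expected is float:
            ok = _is_number(value)  # assumption: ints are accepted as floats
        elif expected is int:
            ok = isinstance(value, int) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            problems.append(f"{field} should be of type {expected.__name__}")
    for name in CONTROLS_NEEDING_INPUT + ("vis",):
        control = config.get(name)
        if control is None:
            continue  # controls are optional
        if not isinstance(control, dict):
            problems.append(f"{name} must be a table")
            continue
        if not _is_number(control.get("control_weight")):
            problems.append(f"{name}.control_weight must be a number")
        needs_input = name in CONTROLS_NEEDING_INPUT
        if needs_input and not isinstance(control.get("input_control"), str):
            problems.append(f"{name}.input_control must be a string path")
    return problems
```

On Python 3.11 or later, the dictionary can be obtained with the standard tomllib module, for example tomllib.load(open("example_data/prompts/rain.toml", "rb")). Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.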
Command-line arguments
cosmos_client.py accepts the following arguments:
| Argument | Type | Description |
|---|---|---|
| endpoint | string | Base URL of the server (e.g., http://localhost:8080) |
| config_toml | string | Path to the TOML configuration |
| --output | string | File or directory path to save the result video |
| --input-video | string | Override input_video_path from the TOML |
| --edge-video | string | Override edge.input_control |
| --depth-video | string | Override depth.input_control |
| --seg-video | string | Override seg.input_control |
| --vis-video | string | Override vis.input_control |
| --seed | int | Override seed |
| --retries | int | Max retries for uploads and generation (default: 3) |
| --backoff-initial | float | Initial backoff seconds (default: 1.5) |
| --backoff-multiplier | float | Exponential backoff multiplier (default: 2.0) |
| --jitter | float | Random jitter added to backoff (default: 0.5) |
| --poll-interval | int | Poll interval in seconds for job status (default: 5) |
| --result-timeout | int | Timeout in seconds when fetching job results (default: 120) |
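To get a feel for how the retry arguments interact, the sketch below computes a plausible wait schedule from them. The exact formula used by cosmos_client.py is not stated here, so this assumes the common scheme of an initial delay multiplied on each attempt, plus a uniform random jitter between 0 and the jitter value; backoff_delays is a hypothetical helper.

```python
import random


def backoff_delays(retries=3, backoff_initial=1.5, backoff_multiplier=2.0,
                   jitter=0.5, rng=None):
    """Return the assumed wait in seconds before each retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        # Exponential growth: initial, initial * m, initial * m**2, ...
        base = backoff_initial * (backoff_multiplier ** attempt)
        # Assumption: jitter is added uniformly in [0, jitter].
        delays.append(base + rng.uniform(0.0, jitter))
    return delays


if __name__ == "__main__":
    # With jitter disabled, the defaults give a deterministic schedule.
    print(backoff_delays(jitter=0.0))  # -> [1.5, 3.0, 6.0]
    # With the default jitter, each delay grows by up to 0.5 seconds.
    print(backoff_delays())
```

Under these assumptions, the default settings would spend roughly 10 to 12 seconds waiting across three retries, which is small compared with the 1-2 minute generation time.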