Label Studio¶
Label Studio is an open-source data labeling tool that can handle different data types, such as audio, text, images, and HTML. It allows users to label, annotate, and explore their data. It also provides a machine learning interface that supports various training techniques, such as active learning and supervised learning.
For more information, check here.
Account Information¶
Upon startup, the Label Studio web interface prompts you to log in or to sign up for a new user account.
Sign up¶
You need to create an account when starting a new Label Studio project. To do so, import an empty folder from your UCloud workspace using the mandatory --data-dir parameter.
When signing up, the username must be an email address and the password must be at least 8 characters long. There is no email verification.
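The two sign-up rules can be checked locally before submitting a job. The following is a minimal sketch: the function name `check_credentials` and the simplified email pattern are assumptions made for illustration and are not Label Studio's own validation logic.

```python
import re
from typing import List

# Simplified email pattern (an assumption for illustration; Label Studio's
# own validation may accept or reject different addresses).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_credentials(username: str, password: str) -> List[str]:
    """Return a list of problems; an empty list means both rules are met."""
    problems = []
    if not EMAIL_PATTERN.match(username):
        problems.append("username must be an email address")
    if len(password) < 8:
        problems.append("password must be at least 8 characters long")
    return problems


if __name__ == "__main__":
    print(check_credentials("user@example.com", "s3cretpass"))
    print(check_credentials("user", "short"))
```

An empty list indicates that the values satisfy both rules stated above.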
The login credentials can also be specified before submitting the job via the optional --username and --password parameters.
Note
The sign-up page is disabled when the app is started with a public link.
Log in¶
You can log in with the user credentials created in a previous session.
Database Setup¶
Label Studio uses SQLite by default. You don’t need to configure anything. Label Studio stores all data in a single file inside the directory imported using the --data-dir option.
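Because the database is a single SQLite file, it can be inspected with Python's standard `sqlite3` module. The sketch below lists the tables in such a file and opens it read-only so that the inspection cannot modify it. The helper name `list_tables` is hypothetical, and the actual file name and location inside the data directory are not stated here, so they must be looked up in the imported folder.

```python
import sqlite3
from pathlib import Path
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the names of all tables in a SQLite database, sorted."""
    path = Path(db_path)
    if not path.is_file():
        raise FileNotFoundError(db_path)
    # Open read-only so that inspecting the file cannot modify it.
    uri = path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `list_tables` with the path of the database file found in the data directory returns the table names as a list of strings.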
For more information, check here.
Import & Export Files¶
Files for labeling (e.g. images) can be imported using the Label Studio user interface. By default, files are uploaded directly from the user's computer.
To add files stored on a UCloud drive, import the corresponding directory, e.g. mydataset, within the application using the optional Select folder to use parameter.
In Label Studio, after creating a project, go to the project settings, and under the Cloud Storage tab, select Add Source Storage or Add Target Storage.
Then, select Local files as Storage Type and insert the Absolute local path, e.g. /work/mydataset. Finally, click on Check Connection and press Add Storage to import your data inside the project.
If there are files other than JSON files, activate the switch Treat every bucket object as a source file. In the File Filter Regex field, you can specify a regular expression to filter bucket objects.
Use .* to collect all objects. After adding the storage, click on Sync Storage to collect tasks from the bucket.
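To preview which objects a File Filter Regex would collect, the pattern can first be tried on a list of file names. The sketch below assumes that the filter is matched from the start of each object name, as Python's `re.match` does; this is an assumption about the matching behaviour, and the helper `filter_objects` and the file names are hypothetical.

```python
import re
from typing import Iterable, List


def filter_objects(names: Iterable[str], pattern: str) -> List[str]:
    """Keep the names that the pattern matches from their first character."""
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


if __name__ == "__main__":
    names = ["cat.jpg", "dog.png", "notes.txt", "tasks.json"]
    print(filter_objects(names, r".*"))              # keeps all four names
    print(filter_objects(names, r".*\.(jpg|png)$"))  # ['cat.jpg', 'dog.png']
```

Under this assumption, a pattern such as `jpg` alone would match nothing, because no name starts with `jpg`; prefixing it with `.*` avoids that.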
For more information, check here.
During export, the data is downloaded via the browser in the specified format. Additionally, a log entry is created in the export subfolder within the input directory.
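A downloaded export can be post-processed with a few lines of Python. The sketch below assumes the JSON export format, in which the file holds a list of tasks and each task carries an `id` and an `annotations` list; other export formats are laid out differently. The helper name `count_annotations` and the sample data are hypothetical.

```python
import json
from typing import Dict


def count_annotations(export_json: str) -> Dict[int, int]:
    """Map each task id to the number of annotations stored for it."""
    tasks = json.loads(export_json)
    return {task["id"]: len(task.get("annotations", [])) for task in tasks}


if __name__ == "__main__":
    # Hand-written sample that mimics the assumed export layout.
    sample = json.dumps([
        {"id": 1, "data": {"image": "cat.jpg"}, "annotations": [{"result": []}]},
        {"id": 2, "data": {"image": "dog.png"}, "annotations": []},
    ])
    print(count_annotations(sample))  # {1: 1, 2: 0}
```

A count of zero points to a task that has not been labeled yet.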
ML backend¶
Label Studio ML backend is an SDK that allows users to start a server to integrate Label Studio with machine learning models.
Initialize the server¶
To initialize the server, open a terminal window inside the app and run the command:
$ label-studio-ml init --root-dir /work my_ml_backend
This will create a new directory /work/my_ml_backend with templates to define your ML backend model:
my_ml_backend/
├── Dockerfile
├── README.md
├── __pycache__
│   └── model.cpython-310.pyc
├── _wsgi.py
├── docker-compose.yml
├── model.py
└── requirements.txt
The model dependencies should be added to the requirements.txt file.
Note
These dependencies can be installed automatically before launching the Label Studio app by importing the requirements.txt file via the optional Initialization parameter.
For machine learning example tutorials, check here.
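The model.py template is where predictions are produced. As a rough sketch of what a prediction for a single-choice classification task can look like, the helper below builds one prediction dictionary in plain Python. It deliberately does not import the ML backend SDK; the helper name `make_choice_prediction` and the tag names `sentiment` and `text` are hypothetical and would have to match the project's labeling configuration.

```python
from typing import Any, Dict


def make_choice_prediction(label: str, score: float,
                           from_name: str = "sentiment",
                           to_name: str = "text") -> Dict[str, Any]:
    """Build one prediction for a single-choice classification task."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return {
        "result": [{
            "from_name": from_name,  # name of the control tag (hypothetical)
            "to_name": to_name,      # name of the object tag (hypothetical)
            "type": "choices",
            "value": {"choices": [label]},
        }],
        "score": score,
    }


if __name__ == "__main__":
    print(make_choice_prediction("Positive", 0.87))
```

In a real backend, a dictionary of this shape would be produced for each task inside the model's prediction method.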
Start the server¶
To start the ML backend server in the background, change to the ML backend directory and run the command:
$ gunicorn _wsgi:app --bind "0.0.0.0:9090" --workers 2 &
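Since the server starts in the background, it can be useful to confirm that it is accepting connections before adding it in Label Studio. The sketch below only tests whether a TCP connection to the port succeeds; it makes no assumption about the endpoints the server exposes. The helper name `is_port_open` is hypothetical, and port 9090 is taken from the command above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # The server above binds to 0.0.0.0, so it is reachable on the loopback address.
    print(is_port_open("127.0.0.1", 9090))
```

A result of `False` shortly after launch may only mean that the workers are still starting, so retrying after a few seconds is reasonable.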
Load the model¶
Open the project settings in Label Studio and go to Machine Learning. Then, click on Add Model, specify the URL, e.g. http://0.0.0.0:9090, and press Validate and Save.
For more information on how to integrate Label Studio into your machine learning pipeline, check here.