Jobs
Jobs in UCloud are the core abstraction used to describe units of computation.
Rationale
📝 NOTE: This API follows the standard Resources API. We recommend that you have already read and understood the concepts described here.
The compute system allows for a variety of computational workloads to run on UCloud. All compute jobs in UCloud run an application on one or more ‘nodes’. The type of application determines what the job does:

- Batch applications provide support for long-running computational workloads (typically containerized)
- Web applications provide support for applications which expose a graphical web-interface
- VNC applications provide support for interactive remote desktop workloads
- Virtual machine applications provide support for more advanced workloads which aren’t easily containerized or require special privileges

Every Job is created from a specification. The specification contains input parameters, such as files and application flags, and additional resources. Zero or more resources can be connected to an application, and provide services such as:

- Networking between multiple Jobs

📝 Provider Note: This is the API exposed to end-users. See the table below for other relevant APIs.
End-User | Provider (Ingoing) | Control (Outgoing) |
---|---|---|
Jobs | JobsProvider | JobsControl |
Table of Contents
1. Examples
2. Remote Procedure Calls
Name | Description |
---|---|
browse | Browses the catalog of all Jobs |
follow | Follow the progress of a job |
retrieve | Retrieves a single Job |
retrieveProducts | Retrieve product support for all accessible providers |
retrieveUtilization | Retrieve information about how busy the provider's cluster currently is |
search | Searches the catalog of available resources |
create | Creates one or more resources |
extend | Extend the duration of one or more jobs |
init | Request (potential) initialization of resources |
openInteractiveSession | Opens an interactive session (e.g. terminal, web or VNC) |
suspend | Suspend a job |
terminate | Request job cancellation and destruction |
unsuspend | Unsuspends a job |
updateAcl | Updates the ACL attached to a resource |
3. Data Models
Name | Description |
---|---|
Job | A `Job` in UCloud is the core abstraction used to describe a unit of computation. |
JobSpecification | A specification of a Job |
JobState | A value describing the current state of a Job |
InteractiveSessionType | A value describing a type of 'interactive' session |
JobIncludeFlags | Flags used to tweak read operations of Jobs |
ComputeSupport | No description |
ComputeSupport.Docker | No description |
ComputeSupport.Native | No description |
ComputeSupport.VirtualMachine | No description |
CpuAndMemory | No description |
ExportedParameters | No description |
ExportedParameters.Resources | No description |
JobBindKind | No description |
JobBinding | No description |
JobOutput | No description |
JobStatus | Describes the current state of the `Resource` |
JobUpdate | Describes an update to the `Resource` |
JobsLog | No description |
OpenSession | No description |
OpenSession.Shell | No description |
OpenSession.Vnc | No description |
OpenSession.Web | No description |
OpenSessionWithProvider | No description |
QueueStatus | No description |
ExportedParametersRequest | No description |
JobsExtendRequestItem | No description |
JobsOpenInteractiveSessionRequestItem | No description |
JobsRetrieveUtilizationRequest | No description |
JobsFollowResponse | No description |
JobsRetrieveUtilizationResponse | No description |
Example: Creating a simple batch Job
Frequency of use | Common |
---|---|
Trigger | User initiated |
Pre-conditions |
|
Post-conditions |
|
Actors |
|
Communication Flow: Kotlin
/* The user finds an interesting application from the catalog */
/* The user selects the first application ('batch' in version '1.0.0') */
/* The user requests additional information about the application */
val application = AppStore.findByNameAndVersion.call(
FindByNameAndVersionRequest(
appName = "a-batch-application",
appVersion = "1.0.0",
),
user
).orThrow()
/*
application = ApplicationWithFavoriteAndTags(
favorite = false,
invocation = ApplicationInvocationDescription(
allowAdditionalMounts = null,
allowAdditionalPeers = null,
allowMultiNode = false,
allowPublicIp = false,
allowPublicLink = null,
applicationType = ApplicationType.BATCH,
container = null,
environment = null,
fileExtensions = emptyList(),
invocation = listOf(WordInvocationParameter(
word = "batch",
), VariableInvocationParameter(
isPrefixVariablePartOfArg = false,
isSuffixVariablePartOfArg = false,
prefixGlobal = "",
prefixVariable = "",
suffixGlobal = "",
suffixVariable = "",
variableNames = listOf("var"),
)),
licenseServers = emptyList(),
modules = null,
outputFileGlobs = listOf("*"),
parameters = listOf(ApplicationParameter.Text(
defaultValue = null,
description = "An example input variable",
name = "var",
optional = false,
title = "",
)),
shouldAllowAdditionalMounts = false,
shouldAllowAdditionalPeers = true,
ssh = null,
tool = ToolReference(
name = "batch-tool",
tool = Tool(
createdAt = 1632979836013,
description = NormalizedToolDescription(
authors = listOf("UCloud"),
backend = ToolBackend.DOCKER,
container = null,
defaultNumberOfNodes = 1,
defaultTimeAllocation = SimpleDuration(
hours = 1,
minutes = 0,
seconds = 0,
),
description = "Batch tool",
image = "dreg.cloud.sdu.dk/batch/batch:1.0.0",
info = NameAndVersion(
name = "batch-tool",
version = "1.0.0",
),
license = "None",
requiredModules = emptyList(),
supportedProviders = null,
title = "Batch tool",
),
modifiedAt = 1632979836013,
owner = "user",
),
version = "1.0.0",
),
vnc = null,
web = null,
),
metadata = ApplicationMetadata(
authors = listOf("UCloud"),
createdAt = 1717663228434,
description = "This is a batch application",
flavorName = null,
group = null,
isPublic = true,
name = "a-batch-application",
public = true,
title = "A Batch Application",
version = "1.0.0",
website = null,
),
tags = listOf("very-scientific"),
)
*/
/* The user looks for a suitable machine */
val machineTypes = Products.browse.call(
ProductsBrowseRequest(
consistency = null,
filterArea = ProductType.COMPUTE,
filterCategory = null,
filterName = null,
filterProvider = null,
filterVersion = null,
includeBalance = null,
includeMaxBalance = null,
itemsPerPage = 50,
itemsToSkip = null,
next = null,
showAllVersions = null,
),
user
).orThrow()
/*
machineTypes = PageV2(
items = listOf(Product.Compute(
allowAllocationRequestsFrom = AllocationRequestsGroup.ALL,
category = ProductCategoryId(
id = "example-compute",
name = "example-compute",
provider = "example",
),
chargeType = ChargeType.ABSOLUTE,
cpu = 10,
cpuModel = null,
description = "An example compute product",
freeToUse = false,
gpu = 0,
gpuModel = null,
hiddenInGrantApplications = false,
memoryInGigs = 20,
memoryModel = null,
name = "example-compute",
pricePerUnit = 1000000,
priority = 0,
productType = ProductType.COMPUTE,
unitOfPrice = ProductPriceUnit.CREDITS_PER_MINUTE,
version = 1,
balance = null,
id = "example-compute",
maxUsableBalance = null,
)),
itemsPerPage = 50,
next = null,
)
*/
/* The user starts the Job with input based on previous requests */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "a-batch-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = mapOf("var" to AppParameterValue.Text(
value = "Example",
)),
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "48920",
)),
)
*/
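Before submitting the specification, a client may want to confirm that every mandatory input of the application has a value. The following is a minimal sketch of such a check, not part of the UCloud client: `ParameterInfo` and `missingParameters` are hypothetical names, and the rule that a parameter is mandatory when it is not optional and has no default value is an assumption read off the `optional` and `defaultValue` fields in the output above.

```kotlin
// Hypothetical, simplified mirror of the parameter fields shown in the example output.
// These are NOT the real UCloud classes.
data class ParameterInfo(
    val name: String,
    val optional: Boolean,
    val defaultValue: String?,
)

// Returns the names of parameters that must be supplied but are absent.
// Assumption: a parameter is mandatory when it is not optional and has no default.
fun missingParameters(
    declared: List<ParameterInfo>,
    supplied: Map<String, String>,
): List<String> =
    declared
        .filter { !it.optional && it.defaultValue == null }
        .map { it.name }
        .filter { it !in supplied }

fun main() {
    // The single parameter declared by 'a-batch-application' in the example.
    val declared = listOf(ParameterInfo(name = "var", optional = false, defaultValue = null))

    // Matches the request in the example: 'var' is supplied.
    println(missingParameters(declared, mapOf("var" to "Example")))

    // Nothing supplied: 'var' is reported as missing.
    println(missingParameters(declared, emptyMap()))
}
```

Running the check on the client side only gives earlier feedback; the server remains the authority on whether a specification is valid.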
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# The user finds an interesting application from the catalog
# The user selects the first application ('batch' in version '1.0.0')
# The user requests additional information about the application
# Authenticated as user
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/hpc/apps/byNameAndVersion?appName=a-batch-application&appVersion=1.0.0"
# application =
# {
# "metadata": {
# "name": "a-batch-application",
# "version": "1.0.0",
# "authors": [
# "UCloud"
# ],
# "title": "A Batch Application",
# "description": "This is a batch application",
# "website": null,
# "public": true,
# "flavorName": null,
# "group": null,
# "createdAt": 1717663228434
# },
# "invocation": {
# "tool": {
# "name": "batch-tool",
# "version": "1.0.0",
# "tool": {
# "owner": "user",
# "createdAt": 1632979836013,
# "modifiedAt": 1632979836013,
# "description": {
# "info": {
# "name": "batch-tool",
# "version": "1.0.0"
# },
# "container": null,
# "defaultNumberOfNodes": 1,
# "defaultTimeAllocation": {
# "hours": 1,
# "minutes": 0,
# "seconds": 0
# },
# "requiredModules": [
# ],
# "authors": [
# "UCloud"
# ],
# "title": "Batch tool",
# "description": "Batch tool",
# "backend": "DOCKER",
# "license": "None",
# "image": "dreg.cloud.sdu.dk/batch/batch:1.0.0",
# "supportedProviders": null
# }
# }
# },
# "invocation": [
# {
# "type": "word",
# "word": "batch"
# },
# {
# "type": "var",
# "variableNames": [
# "var"
# ],
# "prefixGlobal": "",
# "suffixGlobal": "",
# "prefixVariable": "",
# "suffixVariable": "",
# "isPrefixVariablePartOfArg": false,
# "isSuffixVariablePartOfArg": false
# }
# ],
# "parameters": [
# {
# "type": "text",
# "name": "var",
# "optional": false,
# "defaultValue": null,
# "title": "",
# "description": "An example input variable"
# }
# ],
# "outputFileGlobs": [
# "*"
# ],
# "applicationType": "BATCH",
# "vnc": null,
# "web": null,
# "ssh": null,
# "container": null,
# "environment": null,
# "allowAdditionalMounts": null,
# "allowAdditionalPeers": null,
# "allowMultiNode": false,
# "allowPublicIp": false,
# "allowPublicLink": null,
# "fileExtensions": [
# ],
# "licenseServers": [
# ],
# "modules": null
# },
# "favorite": false,
# "tags": [
# "very-scientific"
# ]
# }
# The user looks for a suitable machine
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/products/browse?itemsPerPage=50&filterArea=COMPUTE"
# machineTypes =
# {
# "itemsPerPage": 50,
# "items": [
# {
# "type": "compute",
# "balance": null,
# "maxUsableBalance": null,
# "name": "example-compute",
# "pricePerUnit": 1000000,
# "category": {
# "name": "example-compute",
# "provider": "example"
# },
# "description": "An example compute product",
# "priority": 0,
# "cpu": 10,
# "memoryInGigs": 20,
# "gpu": 0,
# "cpuModel": null,
# "memoryModel": null,
# "gpuModel": null,
# "version": 1,
# "freeToUse": false,
# "allowAllocationRequestsFrom": "ALL",
# "unitOfPrice": "CREDITS_PER_MINUTE",
# "chargeType": "ABSOLUTE",
# "hiddenInGrantApplications": false,
# "productType": "COMPUTE"
# }
# ],
# "next": null
# }
# The user starts the Job with input based on previous requests
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "a-batch-application",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": {
"var": {
"type": "text",
"value": "Example"
}
},
"resources": null,
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "48920"
# }
# ]
# }
Example: Following the progress of a Job
Frequency of use | Common |
---|---|
Pre-conditions |
|
Actors |
|
Communication Flow: Kotlin
Jobs.follow.subscribe(
FindByStringId(
id = "123",
),
user,
handler = { /* will receive messages listed below */ }
)
/*
JobsFollowResponse(
log = emptyList(),
newStatus = JobStatus(
allowRestart = false,
expiresAt = null,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.IN_QUEUE,
),
updates = emptyList(),
)
*/
/*
JobsFollowResponse(
log = emptyList(),
newStatus = JobStatus(
allowRestart = false,
expiresAt = null,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.RUNNING,
),
updates = listOf(JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.RUNNING,
status = "The job is now running",
timestamp = 1633680152778,
)),
)
*/
/*
JobsFollowResponse(
log = listOf(JobsLog(
rank = 0,
stderr = null,
stdout = "GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)" + "\n" +
"Copyright (C) 2019 Free Software Foundation, Inc." + "\n" +
"License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>" + "\n" +
"" + "\n" +
"This is free software; you are free to change and redistribute it." + "\n" +
"There is NO WARRANTY, to the extent permitted by law.",
)),
newStatus = JobStatus(
allowRestart = false,
expiresAt = null,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.RUNNING,
),
updates = emptyList(),
)
*/
/*
JobsFollowResponse(
log = emptyList(),
newStatus = JobStatus(
allowRestart = false,
expiresAt = null,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.SUCCESS,
),
updates = listOf(JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.SUCCESS,
status = "The job is no longer running",
timestamp = 1633680152778,
)),
)
*/
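The handler passed to `follow` receives a sequence of messages like the ones listed above. One way to consume them is to fold each message into a running view of the Job. The sketch below illustrates that idea with hypothetical stand-in types (`State`, `LogChunk`, `FollowMessage`, `JobView`); it only models the three states that appear in this example and is not the real client API.

```kotlin
// Hypothetical, trimmed-down stand-ins for the message types shown above.
// Only the states that occur in this example are modelled.
enum class State { IN_QUEUE, RUNNING, SUCCESS }

data class LogChunk(val rank: Int, val stdout: String?, val stderr: String?)

data class FollowMessage(
    val newState: State,
    val log: List<LogChunk> = emptyList(),
)

// Accumulated view of a Job, rebuilt from the stream of follow messages.
data class JobView(
    val state: State = State.IN_QUEUE,
    val stdoutByRank: Map<Int, String> = emptyMap(),
)

// Applies one message to the current view and returns the updated view.
fun next(view: JobView, message: FollowMessage): JobView {
    val stdout = view.stdoutByRank.toMutableMap()
    for (chunk in message.log) {
        val text = chunk.stdout ?: continue
        // Append new output to whatever was already collected for this rank.
        stdout[chunk.rank] = (stdout[chunk.rank] ?: "") + text
    }
    return JobView(state = message.newState, stdoutByRank = stdout)
}

fun main() {
    // A shortened version of the message sequence from the example.
    val messages = listOf(
        FollowMessage(State.IN_QUEUE),
        FollowMessage(State.RUNNING),
        FollowMessage(State.RUNNING, listOf(LogChunk(rank = 0, stdout = "GNU bash", stderr = null))),
        FollowMessage(State.SUCCESS),
    )

    val finalView = messages.fold(JobView(), ::next)
    println(finalView.state)
    println(finalView.stdoutByRank[0])
}
```

Keeping the update step a pure function makes it straightforward to call from inside the subscription handler and to test without a running Job.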
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
Example: Starting an interactive terminal session
Frequency of use | Common |
---|---|
Trigger | User initiated by clicking on 'Open Terminal' of a running Job |
Pre-conditions |
|
Actors |
|
Communication Flow: Kotlin
Jobs.retrieveProducts.call(
Unit,
user
).orThrow()
/*
SupportByProvider(
productsByProvider = mapOf("example" to listOf(ResolvedSupport(
product = Product.Compute(
allowAllocationRequestsFrom = AllocationRequestsGroup.ALL,
category = ProductCategoryId(
id = "compute-example",
name = "compute-example",
provider = "example",
),
chargeType = ChargeType.ABSOLUTE,
cpu = 1,
cpuModel = null,
description = "An example machine",
freeToUse = false,
gpu = 0,
gpuModel = null,
hiddenInGrantApplications = false,
memoryInGigs = 2,
memoryModel = null,
name = "compute-example",
pricePerUnit = 1000000,
priority = 0,
productType = ProductType.COMPUTE,
unitOfPrice = ProductPriceUnit.CREDITS_PER_MINUTE,
version = 1,
balance = null,
id = "compute-example",
maxUsableBalance = null,
),
support = ComputeSupport(
docker = ComputeSupport.Docker(
enabled = true,
logs = null,
peers = null,
terminal = true,
timeExtension = null,
utilization = null,
vnc = null,
web = null,
),
maintenance = null,
native = ComputeSupport.Native(
enabled = null,
logs = null,
terminal = null,
timeExtension = null,
utilization = null,
vnc = null,
web = null,
),
product = ProductReference(
category = "compute-example",
id = "compute-example",
provider = "example",
),
virtualMachine = ComputeSupport.VirtualMachine(
enabled = null,
logs = null,
suspension = null,
terminal = null,
timeExtension = null,
utilization = null,
vnc = null,
),
),
))),
)
*/
/* 📝 Note: The machine has support for the 'terminal' feature */
Jobs.openInteractiveSession.call(
bulkRequestOf(JobsOpenInteractiveSessionRequestItem(
id = "123",
rank = 1,
sessionType = InteractiveSessionType.SHELL,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(OpenSessionWithProvider(
providerDomain = "provider.example.com",
providerId = "example",
session = OpenSession.Shell(
domainOverride = null,
jobId = "123",
rank = 1,
sessionIdentifier = "a81ea644-58f5-44d9-8e94-89f81666c441",
),
)),
)
*/
/* The session is now open and we can establish a shell connection directly with provider.example.com */
Shells.open.subscribe(
ShellRequest.Initialize(
cols = 80,
rows = 24,
sessionIdentifier = "a81ea644-58f5-44d9-8e94-89f81666c441",
),
user,
handler = { /* will receive messages listed below */ }
)
/*
ShellResponse.Data(
data = "user@machine:~$ ",
)
*/
Shells.open.call(
ShellRequest.Input(
data = "ls -1" + "\n" +
"",
),
user
).orThrow()
/*
ShellResponse.Data(
data = "ls -1" + "\n" +
"",
)
*/
/*
ShellResponse.Data(
data = "hello_world.txt" + "\n" +
"",
)
*/
/*
ShellResponse.Data(
data = "user@machine:~$ ",
)
*/
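The note in this flow relies on reading the `terminal` flag out of the support information returned by `retrieveProducts`. A minimal sketch of that lookup follows. The types (`ProductRef`, `DockerSupport`, `Support`) and the function `supportsTerminal` are hypothetical simplifications, and treating `null` as "not supported" is an assumption of this sketch rather than documented behaviour.

```kotlin
// Hypothetical stand-ins for the structures in the retrieveProducts output above.
data class ProductRef(val id: String, val category: String, val provider: String)

// Fields are nullable because the example output uses null for unspecified features.
data class DockerSupport(val enabled: Boolean?, val terminal: Boolean?)

data class Support(val product: ProductRef, val docker: DockerSupport)

// Looks up the support entry for the wanted product and checks the terminal flag.
// Assumption: null is treated as "not supported".
fun supportsTerminal(
    productsByProvider: Map<String, List<Support>>,
    wanted: ProductRef,
): Boolean {
    val support = productsByProvider[wanted.provider]
        .orEmpty()
        .firstOrNull { it.product == wanted }
        ?: return false
    return support.docker.enabled == true && support.docker.terminal == true
}

fun main() {
    val wanted = ProductRef(id = "compute-example", category = "compute-example", provider = "example")
    val productsByProvider = mapOf(
        "example" to listOf(Support(wanted, DockerSupport(enabled = true, terminal = true)))
    )

    println(supportsTerminal(productsByProvider, wanted))
}
```

A client could run a check like this before offering an 'Open Terminal' action, so that the request is only sent for machines that advertise the feature.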
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# Authenticated as user
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/jobs/retrieveProducts"
# {
# "productsByProvider": {
# "example": [
# {
# "product": {
# "balance": null,
# "maxUsableBalance": null,
# "name": "compute-example",
# "pricePerUnit": 1000000,
# "category": {
# "name": "compute-example",
# "provider": "example"
# },
# "description": "An example machine",
# "priority": 0,
# "cpu": 1,
# "memoryInGigs": 2,
# "gpu": 0,
# "cpuModel": null,
# "memoryModel": null,
# "gpuModel": null,
# "version": 1,
# "freeToUse": false,
# "allowAllocationRequestsFrom": "ALL",
# "unitOfPrice": "CREDITS_PER_MINUTE",
# "chargeType": "ABSOLUTE",
# "hiddenInGrantApplications": false,
# "productType": "COMPUTE"
# },
# "support": {
# "product": {
# "id": "compute-example",
# "category": "compute-example",
# "provider": "example"
# },
# "docker": {
# "enabled": true,
# "web": null,
# "vnc": null,
# "logs": null,
# "terminal": true,
# "peers": null,
# "timeExtension": null,
# "utilization": null
# },
# "virtualMachine": {
# "enabled": null,
# "logs": null,
# "vnc": null,
# "terminal": null,
# "timeExtension": null,
# "suspension": null,
# "utilization": null
# },
# "native": {
# "enabled": null,
# "logs": null,
# "vnc": null,
# "terminal": null,
# "timeExtension": null,
# "utilization": null,
# "web": null
# },
# "maintenance": null
# }
# }
# ]
# }
# }
# 📝 Note: The machine has support for the 'terminal' feature
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs/interactiveSession" -d '{
"items": [
{
"id": "123",
"rank": 1,
"sessionType": "SHELL"
}
]
}'
# {
# "responses": [
# {
# "providerDomain": "provider.example.com",
# "providerId": "example",
# "session": {
# "type": "shell",
# "jobId": "123",
# "rank": 1,
# "sessionIdentifier": "a81ea644-58f5-44d9-8e94-89f81666c441",
# "domainOverride": null
# }
# }
# ]
# }
# The session is now open and we can establish a shell connection directly with provider.example.com
Example: Connecting two Jobs together
Frequency of use | Common |
---|---|
Trigger | User initiated |
Actors |
|
Communication Flow: Kotlin
/* In this example, our user wishes to deploy a simple web application which connects to a database server */
/* The user first provisions a database server using an Application */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-database",
version = "1.0.0",
),
name = "my-database",
openedFile = null,
parameters = mapOf("dataStore" to AppParameterValue.File(
path = "/123/acme-database",
readOnly = false,
)),
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "4101",
)),
)
*/
/* The database is now `RUNNING` with the persistent storage from `/123/acme-database` */
/* By default, the UCloud firewall will not allow any ingoing connections to the Job. This firewall
can be updated by connecting one or more Jobs together. We will now do this using the Application
"Peer" feature. This feature is commonly referred to as "Connect to Job". */
/* We will now start our web-application and connect it to our existing database Job */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-app",
version = "1.0.0",
),
name = "my-web-app",
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = listOf(AppParameterValue.Peer(
hostname = "database",
jobId = "4101",
)),
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "4150",
)),
)
*/
/* The web-application can now connect to the database using the 'database' hostname, as specified in
the JobSpecification. */
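When the request is sent as raw JSON rather than through the Kotlin client, each peer becomes an object with a `type` of `peer`, a `hostname` and a `jobId` inside the `resources` array. The sketch below renders that array by hand. `Peer` and `peersToJson` are hypothetical helpers, and instead of implementing JSON string escaping the sketch simply rejects values outside a conservative character set, which is an assumption of this example and not a rule stated by UCloud.

```kotlin
// Hypothetical helper type; it mirrors the two fields of a peer resource.
data class Peer(val hostname: String, val jobId: String)

// Conservative character set chosen for this sketch so that no JSON escaping is needed.
private val safeValue = Regex("[A-Za-z0-9.-]+")

// Renders the list of peers as the JSON array used for the "resources" field.
fun peersToJson(peers: List<Peer>): String {
    for (peer in peers) {
        require(safeValue.matches(peer.hostname)) { "Unsupported hostname: ${peer.hostname}" }
        require(safeValue.matches(peer.jobId)) { "Unsupported job ID: ${peer.jobId}" }
    }
    return peers.joinToString(separator = ",", prefix = "[", postfix = "]") {
        """{"type":"peer","hostname":"${it.hostname}","jobId":"${it.jobId}"}"""
    }
}

fun main() {
    // The database Job from the example, reachable under the hostname 'database'.
    println(peersToJson(listOf(Peer(hostname = "database", jobId = "4101"))))
}
```

In a real client, a JSON library would normally take care of serialization and escaping; the manual rendering here only serves to show the shape of the data.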
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example, our user wishes to deploy a simple web application which connects to a database server
# The user first provisions a database server using an Application
# Authenticated as user
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-database",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": "my-database",
"replicas": 1,
"allowDuplicateJob": false,
"parameters": {
"dataStore": {
"type": "file",
"path": "/123/acme-database",
"readOnly": false
}
},
"resources": null,
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "4101"
# }
# ]
# }
# The database is now `RUNNING` with the persistent storage from `/123/acme-database`
# By default, the UCloud firewall will not allow any ingoing connections to the Job. This firewall
# can be updated by connecting one or more Jobs together. We will now do this using the Application
# "Peer" feature. This feature is commonly referred to as "Connect to Job".
# We will now start our web-application and connect it to our existing database Job
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-web-app",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": "my-web-app",
"replicas": 1,
"allowDuplicateJob": false,
"parameters": null,
"resources": [
{
"type": "peer",
"hostname": "database",
"jobId": "4101"
}
],
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "4150"
# }
# ]
# }
# The web-application can now connect to the database using the 'database' hostname, as specified in
# the JobSpecification.
Example: Starting a Job with a public link (Ingress)
Frequency of use | Common |
---|---|
Actors |
|
Communication Flow: Kotlin
/* In this example, the user will create a Job which exposes a web-interface. This web-interface will
become available through a publicly accessible link. */
/* First, the user creates an Ingress resource (this needs to be done once per ingress) */
Ingresses.create.call(
bulkRequestOf(IngressSpecification(
domain = "app-my-application.provider.example.com",
product = ProductReference(
category = "example-ingress",
id = "example-ingress",
provider = "example",
),
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "41231",
)),
)
*/
/* This link can now be attached to any Application which supports a web-interface */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-app",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "compute-example",
id = "compute-example",
provider = "example",
),
replicas = 1,
resources = listOf(AppParameterValue.Ingress(
id = "41231",
)),
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "41252",
)),
)
*/
/* The Application is now running, and we can access it through the public link */
/* The Ingress will also remain exclusively bound to the Job. It will remain like this until the Job
terminates. You can check the status of the Ingress simply by retrieving it. */
Ingresses.retrieve.call(
ResourceRetrieveRequest(
flags = IngressIncludeFlags(
filterCreatedAfter = null,
filterCreatedBefore = null,
filterCreatedBy = null,
filterIds = null,
filterProductCategory = null,
filterProductId = null,
filterProvider = null,
filterProviderIds = null,
filterState = null,
hideProductCategory = null,
hideProductId = null,
hideProvider = null,
includeOthers = false,
includeProduct = false,
includeSupport = false,
includeUpdates = false,
),
id = "41231",
),
user
).orThrow()
/*
Ingress(
createdAt = 1633087693694,
id = "41231",
owner = ResourceOwner(
createdBy = "user",
project = null,
),
permissions = null,
specification = IngressSpecification(
domain = "app-my-application.provider.example.com",
product = ProductReference(
category = "example-ingress",
id = "example-ingress",
provider = "example",
),
),
status = IngressStatus(
boundTo = listOf("41231"),
resolvedProduct = null,
resolvedSupport = null,
state = IngressState.READY,
),
updates = emptyList(),
providerGeneratedId = "41231",
)
*/
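Because an Ingress stays exclusively bound to a Job until that Job terminates, a client may want to check whether an Ingress is free before attaching it to a new Job. The following sketch shows one possible check. `IngressView` and `canAttach` are hypothetical names, the state is kept as a plain string because only `READY` appears in this example, and treating an empty `boundTo` list as "not bound" is an assumption.

```kotlin
// Hypothetical, reduced view of an Ingress: only the fields this check needs.
data class IngressView(
    val id: String,
    val state: String,
    val boundTo: List<String>,
)

// Assumption: an Ingress can be attached when it is READY and not bound to anything.
fun canAttach(ingress: IngressView): Boolean =
    ingress.state == "READY" && ingress.boundTo.isEmpty()

fun main() {
    // Illustrative IDs only; they are not taken from a real system.
    val ingresses = listOf(
        IngressView(id = "ingress-a", state = "READY", boundTo = emptyList()),
        IngressView(id = "ingress-b", state = "READY", boundTo = listOf("job-1")),
    )

    // Keep only the public links that could be attached to a new Job.
    println(ingresses.filter(::canAttach).map { it.id })
}
```

The check is advisory: another Job may bind the Ingress between the check and the create call, so the response from the create call is what finally decides.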
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example, the user will create a Job which exposes a web-interface. This web-interface will
# become available through a publicly accessible link.
# First, the user creates an Ingress resource (this needs to be done once per ingress)
# Authenticated as user
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/ingresses" -d '{
"items": [
{
"domain": "app-my-application.provider.example.com",
"product": {
"id": "example-ingress",
"category": "example-ingress",
"provider": "example"
}
}
]
}'
# {
# "responses": [
# {
# "id": "41231"
# }
# ]
# }
# This link can now be attached to any Application which supports a web-interface
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-web-app",
"version": "1.0.0"
},
"product": {
"id": "compute-example",
"category": "compute-example",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": null,
"resources": [
{
"type": "ingress",
"id": "41231"
}
],
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "41252"
# }
# ]
# }
# The Application is now running, and we can access it through the public link
# The Ingress will also remain exclusively bound to the Job. It will remain like this until the Job
# terminates. You can check the status of the Ingress simply by retrieving it.
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/ingresses/retrieve?includeOthers=false&includeUpdates=false&includeSupport=false&includeProduct=false&id=41231"
# {
# "id": "41231",
# "specification": {
# "domain": "app-my-application.provider.example.com",
# "product": {
# "id": "example-ingress",
# "category": "example-ingress",
# "provider": "example"
# }
# },
# "owner": {
# "createdBy": "user",
# "project": null
# },
# "createdAt": 1633087693694,
# "status": {
# "boundTo": [
# "41231"
# ],
# "state": "READY",
# "resolvedSupport": null,
# "resolvedProduct": null
# },
# "updates": [
# ],
# "permissions": null
# }
Example: Using licensed software
Frequency of use | Common |
---|---|
Pre-conditions |
|
Actors |
|
Communication Flow: Kotlin
/* In this example, the user will run a piece of licensed software. */
/* First, the user must activate a copy of their license, which has previously been granted to them through the Grant system. */
Licenses.create.call(
bulkRequestOf(LicenseSpecification(
product = ProductReference(
category = "example-license",
id = "example-license",
provider = "example",
),
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "56231",
)),
)
*/
/* This license can now freely be used in Jobs */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-licensed-software",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = mapOf("license" to AppParameterValue.License(
id = "56231",
)),
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "55123",
)),
)
*/
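The values passed in `parameters` and `resources` are variants of `AppParameterValue`, and in the JSON form of the requests each variant is identified by a `type` field. The sketch below collects the variants used in this document together with the `type` strings that appear in its curl examples. `ParamValue` and `typeTag` are hypothetical, self-contained stand-ins rather than the real classes, and the list is limited to the variants shown on this page.

```kotlin
// Hypothetical stand-in for the AppParameterValue variants used in this document.
sealed class ParamValue {
    data class Text(val value: String) : ParamValue()
    data class File(val path: String, val readOnly: Boolean) : ParamValue()
    data class Peer(val hostname: String, val jobId: String) : ParamValue()
    data class Ingress(val id: String) : ParamValue()
    data class License(val id: String) : ParamValue()
}

// Maps each variant to the "type" string seen in the JSON request bodies.
fun typeTag(value: ParamValue): String = when (value) {
    is ParamValue.Text -> "text"
    is ParamValue.File -> "file"
    is ParamValue.Peer -> "peer"
    is ParamValue.Ingress -> "ingress"
    // Note that the tag for a license differs from the variant name.
    is ParamValue.License -> "license_server"
}

fun main() {
    // The license activated in this example.
    println(typeTag(ParamValue.License(id = "56231")))
}
```

Modelling the variants as a sealed class lets the compiler verify that the `when` expression covers every case, which helps when a new variant is added.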
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example, the user will run a piece of licensed software.
# First, the user must activate a copy of their license, which has previously been granted to them through the Grant system.
# Authenticated as user
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/licenses" -d '{
"items": [
{
"product": {
"id": "example-license",
"category": "example-license",
"provider": "example"
}
}
]
}'
# {
# "responses": [
# {
# "id": "56231"
# }
# ]
# }
# This license can now freely be used in Jobs
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-licensed-software",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": {
"license": {
"type": "license_server",
"id": "56231"
}
},
"resources": null,
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "55123"
# }
# ]
# }
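The two calls above are linked only by the ID returned from the first one. As a minimal sketch, assuming the first response body has been captured in a shell variable, the ID can be extracted and placed into the `parameters` object of the second request. The variable names and the `sed` pattern are illustrative and not part of UCloud; a real client would more likely use a proper JSON parser.

```shell
# Hypothetical captured body of the license creation call shown above.
license_response='{
  "responses": [
    {
      "id": "56231"
    }
  ]
}'

# Pull out the first "id" value. This line-oriented sed pattern is only a
# stand-in for a real JSON parser and assumes at most one "id" field per line.
license_id=$(printf '%s\n' "$license_response" |
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
  head -n 1)

# Build the "parameters" fragment of the job specification from that ID.
parameters=$(printf '{"license": {"type": "license_server", "id": "%s"}}' "$license_id")

echo "$parameters"
```

The resulting fragment has the same shape as the `parameters` object sent in the job creation request above.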
Communication Flow: Visual
Example: Using a remote desktop Application (VNC)¶
Frequency of use | Common |
---|---|
Actors | An authenticated user (user) |
Communication Flow: Kotlin
/* In this example, the user will create a Job which uses an Application that exposes a VNC interface */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-remote-desktop",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "51231",
)),
)
*/
Jobs.openInteractiveSession.call(
bulkRequestOf(JobsOpenInteractiveSessionRequestItem(
id = "51231",
rank = 0,
sessionType = InteractiveSessionType.VNC,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(OpenSessionWithProvider(
providerDomain = "provider.example.com",
providerId = "example",
session = OpenSession.Vnc(
domainOverride = null,
jobId = "51231",
password = "e7ccc6e0870250073286c44545e6b41820d1db7f",
rank = 0,
url = "vnc-69521c85-4811-43e6-9de3-2a48614d04ab.provider.example.com",
),
)),
)
*/
/* The user can now connect to the remote desktop using the VNC protocol with the above details */
/* NOTE: UCloud expects the provider's VNC endpoint to support VNC over WebSockets, which allows a connection to be
established directly from the browser.
You can read more about the protocol here: https://novnc.com */
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example, the user will create a Job which uses an Application that exposes a VNC interface
# Authenticated as user
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-remote-desktop",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": null,
"resources": null,
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "51231"
# }
# ]
# }
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs/interactiveSession" -d '{
"items": [
{
"id": "51231",
"rank": 0,
"sessionType": "VNC"
}
]
}'
# {
# "responses": [
# {
# "providerDomain": "provider.example.com",
# "providerId": "example",
# "session": {
# "type": "vnc",
# "jobId": "51231",
# "rank": 0,
# "url": "vnc-69521c85-4811-43e6-9de3-2a48614d04ab.provider.example.com",
# "password": "e7ccc6e0870250073286c44545e6b41820d1db7f",
# "domainOverride": null
# }
# }
# ]
# }
# The user can now connect to the remote desktop using the VNC protocol with the above details
# NOTE: UCloud expects the provider's VNC endpoint to support VNC over WebSockets, which allows a connection to be
# established directly from the browser.
#
# You can read more about the protocol here: https://novnc.com
Communication Flow: Visual
Example: Using a web Application¶
Frequency of use | Common |
---|---|
Actors | An authenticated user (user) |
Communication Flow: Kotlin
/* In this example, the user will create a Job which uses an Application that exposes a web interface */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "62342",
)),
)
*/
Jobs.openInteractiveSession.call(
bulkRequestOf(JobsOpenInteractiveSessionRequestItem(
id = "62342",
rank = 0,
sessionType = InteractiveSessionType.WEB,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(OpenSessionWithProvider(
providerDomain = "provider.example.com",
providerId = "example",
session = OpenSession.Web(
domainOverride = null,
jobId = "62342",
rank = 0,
redirectClientTo = "app-gateway.provider.example.com?token=aa2dd29a-fe83-4201-b28e-fe211f94ac9d",
),
)),
)
*/
/* The user should now proceed to the link provided in the response */
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example, the user will create a Job which uses an Application that exposes a web interface
# Authenticated as user
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-web-application",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": null,
"resources": null,
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "62342"
# }
# ]
# }
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs/interactiveSession" -d '{
"items": [
{
"id": "62342",
"rank": 0,
"sessionType": "WEB"
}
]
}'
# {
# "responses": [
# {
# "providerDomain": "provider.example.com",
# "providerId": "example",
# "session": {
# "type": "web",
# "jobId": "62342",
# "rank": 0,
# "redirectClientTo": "app-gateway.provider.example.com?token=aa2dd29a-fe83-4201-b28e-fe211f94ac9d",
# "domainOverride": null
# }
# }
# ]
# }
# The user should now proceed to the link provided in the response
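Both interactive-session responses shown so far carry a `type` field inside `session` (`vnc` in the remote desktop example, `web` here), and the fields a client needs differ between the two. A minimal sketch of dispatching on that value follows. The function name is hypothetical, and only the two types that appear in these examples are handled.

```shell
# Hypothetical helper: name the response fields a client acts on for a given
# session "type". Only the types seen in the examples above are covered.
session_fields() {
  case "$1" in
    vnc) echo "url password" ;;
    web) echo "redirectClientTo" ;;
    *)   echo "unknown" ;;
  esac
}

session_fields "vnc"   # prints: url password
session_fields "web"   # prints: redirectClientTo
```

Falling through to `unknown` keeps the sketch safe for session types that these examples do not show.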
Communication Flow: Visual
Example: Losing access to resources¶
Frequency of use | Common |
---|---|
Actors | An authenticated user (user) |
Communication Flow: Kotlin
/* In this example, the user will create a Job using shared resources. Later in the example, the user
will lose access to these resources. */
/* When the user starts the Job, they have access to some shared files. These are used in the Job (see the resources section). */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = listOf(AppParameterValue.File(
path = "/12512/shared",
readOnly = false,
)),
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "62348",
)),
)
*/
/* The Job is now running */
/* However, a few minutes later the share is revoked. UCloud automatically kills the Job a few minutes
after this. The status now reflects this. */
Jobs.retrieve.call(
ResourceRetrieveRequest(
flags = JobIncludeFlags(
filterApplication = null,
filterCreatedAfter = null,
filterCreatedBefore = null,
filterCreatedBy = null,
filterIds = null,
filterProductCategory = null,
filterProductId = null,
filterProvider = null,
filterProviderIds = null,
filterState = null,
hideProductCategory = null,
hideProductId = null,
hideProvider = null,
includeApplication = null,
includeOthers = false,
includeParameters = null,
includeProduct = false,
includeSupport = false,
includeUpdates = false,
),
id = "62348",
),
user
).orThrow()
/*
Job(
createdAt = 1633588976235,
id = "62348",
output = null,
owner = ResourceOwner(
createdBy = "user",
project = null,
),
permissions = null,
specification = JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = listOf(AppParameterValue.File(
path = "/12512/shared",
readOnly = false,
)),
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
),
status = JobStatus(
allowRestart = false,
expiresAt = null,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.SUCCESS,
),
updates = listOf(JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.IN_QUEUE,
status = "Your job is now waiting in the queue!",
timestamp = 1633588976235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.RUNNING,
status = "Your job is now running!",
timestamp = 1633588981235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.SUCCESS,
status = "Your job has been terminated (Lost permissions)",
timestamp = 1633589101235,
)),
providerGeneratedId = "62348",
)
*/
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example, the user will create a Job using shared resources. Later in the example, the user
# will lose access to these resources.
# When the user starts the Job, they have access to some shared files. These are used in the Job (see the resources section).
# Authenticated as user
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-web-application",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": null,
"resources": [
{
"type": "file",
"path": "/12512/shared",
"readOnly": false
}
],
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "62348"
# }
# ]
# }
# The Job is now running
# However, a few minutes later the share is revoked. UCloud automatically kills the Job a few minutes
# after this. The status now reflects this.
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/jobs/retrieve?includeProduct=false&includeOthers=false&includeUpdates=false&includeSupport=false&id=62348"
# {
# "id": "62348",
# "owner": {
# "createdBy": "user",
# "project": null
# },
# "updates": [
# {
# "state": "IN_QUEUE",
# "outputFolder": null,
# "status": "Your job is now waiting in the queue!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633588976235
# },
# {
# "state": "RUNNING",
# "outputFolder": null,
# "status": "Your job is now running!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633588981235
# },
# {
# "state": "SUCCESS",
# "outputFolder": null,
# "status": "Your job has been terminated (Lost permissions)",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633589101235
# }
# ],
# "specification": {
# "application": {
# "name": "acme-web-application",
# "version": "1.0.0"
# },
# "product": {
# "id": "example-compute",
# "category": "example-compute",
# "provider": "example"
# },
# "name": null,
# "replicas": 1,
# "allowDuplicateJob": false,
# "parameters": null,
# "resources": [
# {
# "type": "file",
# "path": "/12512/shared",
# "readOnly": false
# }
# ],
# "timeAllocation": null,
# "openedFile": null,
# "restartOnExit": null,
# "sshEnabled": null
# },
# "status": {
# "state": "SUCCESS",
# "jobParametersJson": null,
# "startedAt": null,
# "expiresAt": null,
# "resolvedApplication": null,
# "resolvedSupport": null,
# "resolvedProduct": null,
# "allowRestart": false
# },
# "createdAt": 1633588976235,
# "output": null,
# "permissions": null
# }
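The `timestamp` values in `updates` look like milliseconds since the Unix epoch, so the timeline of this Job can be read off with simple subtraction. The sketch below copies the three timestamps from the response above; the variable names are illustrative.

```shell
# Timestamps copied from the three updates in the response above (milliseconds).
queued_at=1633588976235      # state IN_QUEUE
running_at=1633588981235     # state RUNNING
terminated_at=1633589101235  # "Your job has been terminated (Lost permissions)"

# Time spent in the queue and time spent running, in whole seconds.
queue_seconds=$(( (running_at - queued_at) / 1000 ))
run_seconds=$(( (terminated_at - running_at) / 1000 ))

echo "queued for ${queue_seconds}s, ran for ${run_seconds}s"
# prints: queued for 5s, ran for 120s
```

In this example data the Job therefore ran for two minutes before it was terminated.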
Communication Flow: Visual
Example: Running out of compute credits¶
Frequency of use | Common |
---|---|
Actors | An authenticated user (user) |
Communication Flow: Kotlin
/* In this example, the user will create a Job and eventually run out of compute credits. */
/* When the user creates the Job, they have enough credits */
AccountingV2.browseWallets.call(
AccountingV2.BrowseWallets.Request(
childrenQuery = null,
consistency = null,
filterType = null,
includeChildren = false,
itemsPerPage = null,
itemsToSkip = null,
next = null,
),
user
).orThrow()
/*
PageV2(
items = listOf(WalletV2(
allocationGroups = listOf(AllocationGroupWithParent(
group = AllocationGroup(
allocations = listOf(AllocationGroup.Alloc(
endDate = 1664865776235,
grantedIn = null,
id = 12541154,
quota = 500000000,
retiredUsage = null,
startDate = 1633329776235,
)),
id = 1,
usage = 499000000,
),
parent = ParentOrChildWallet(
pi = "user",
projectId = null,
projectTitle = "Root",
),
)),
children = null,
lastSignificantUpdateAt = 0,
localUsage = 499000000,
maxUsable = 100000,
owner = WalletOwner.User(
username = "user",
),
paysFor = ProductCategory(
accountingFrequency = AccountingFrequency.PERIODIC_MINUTE,
accountingUnit = AccountingUnit(
displayFrequencySuffix = false,
floatingPoint = true,
name = "DKK",
namePlural = "DKK",
),
allowSubAllocations = true,
freeToUse = false,
name = "example-compute",
productType = ProductType.COMPUTE,
provider = "example",
),
quota = 500000000,
totalAllocated = 0,
totalUsage = 499000000,
)),
itemsPerPage = 50,
next = null,
)
*/
/* 📝 Note: at this point the user has very few credits remaining.
They will only last a couple of minutes. */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "62348",
)),
)
*/
/* The Job is now running */
/* However, a few minutes later the Job is automatically killed by UCloud. The status now reflects this. */
Jobs.retrieve.call(
ResourceRetrieveRequest(
flags = JobIncludeFlags(
filterApplication = null,
filterCreatedAfter = null,
filterCreatedBefore = null,
filterCreatedBy = null,
filterIds = null,
filterProductCategory = null,
filterProductId = null,
filterProvider = null,
filterProviderIds = null,
filterState = null,
hideProductCategory = null,
hideProductId = null,
hideProvider = null,
includeApplication = null,
includeOthers = false,
includeParameters = null,
includeProduct = false,
includeSupport = false,
includeUpdates = false,
),
id = "62348",
),
user
).orThrow()
/*
Job(
createdAt = 1633588976235,
id = "62348",
output = null,
owner = ResourceOwner(
createdBy = "user",
project = null,
),
permissions = null,
specification = JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = null,
),
status = JobStatus(
allowRestart = false,
expiresAt = null,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.SUCCESS,
),
updates = listOf(JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.IN_QUEUE,
status = "Your job is now waiting in the queue!",
timestamp = 1633588976235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.RUNNING,
status = "Your job is now running!",
timestamp = 1633588981235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.SUCCESS,
status = "Your job has been terminated (No more credits)",
timestamp = 1633589101235,
)),
providerGeneratedId = "62348",
)
*/
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example, the user will create a Job and eventually run out of compute credits.
# When the user creates the Job, they have enough credits
# Authenticated as user
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/accounting/v2/browseWallets?includeChildren=false"
# {
# "itemsPerPage": 50,
# "items": [
# {
# "owner": {
# "type": "user",
# "username": "user"
# },
# "paysFor": {
# "name": "example-compute",
# "provider": "example",
# "productType": "COMPUTE",
# "accountingUnit": {
# "name": "DKK",
# "namePlural": "DKK",
# "floatingPoint": true,
# "displayFrequencySuffix": false
# },
# "accountingFrequency": "PERIODIC_MINUTE",
# "freeToUse": false,
# "allowSubAllocations": true
# },
# "allocationGroups": [
# {
# "parent": {
# "projectId": null,
# "projectTitle": "Root",
# "pi": "user"
# },
# "group": {
# "id": 1,
# "allocations": [
# {
# "id": 12541154,
# "startDate": 1633329776235,
# "endDate": 1664865776235,
# "quota": 500000000,
# "grantedIn": null,
# "retiredUsage": null
# }
# ],
# "usage": 499000000
# }
# }
# ],
# "children": null,
# "totalUsage": 499000000,
# "localUsage": 499000000,
# "maxUsable": 100000,
# "quota": 500000000,
# "totalAllocated": 0,
# "lastSignificantUpdateAt": 0
# }
# ],
# "next": null
# }
# 📝 Note: at this point the user has very few credits remaining.
# They will only last a couple of minutes.
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-web-application",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": null,
"resources": null,
"timeAllocation": null,
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "62348"
# }
# ]
# }
# The Job is now running
# However, a few minutes later the Job is automatically killed by UCloud. The status now reflects this.
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/jobs/retrieve?includeProduct=false&includeOthers=false&includeUpdates=false&includeSupport=false&id=62348"
# {
# "id": "62348",
# "owner": {
# "createdBy": "user",
# "project": null
# },
# "updates": [
# {
# "state": "IN_QUEUE",
# "outputFolder": null,
# "status": "Your job is now waiting in the queue!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633588976235
# },
# {
# "state": "RUNNING",
# "outputFolder": null,
# "status": "Your job is now running!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633588981235
# },
# {
# "state": "SUCCESS",
# "outputFolder": null,
# "status": "Your job has been terminated (No more credits)",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633589101235
# }
# ],
# "specification": {
# "application": {
# "name": "acme-web-application",
# "version": "1.0.0"
# },
# "product": {
# "id": "example-compute",
# "category": "example-compute",
# "provider": "example"
# },
# "name": null,
# "replicas": 1,
# "allowDuplicateJob": false,
# "parameters": null,
# "resources": null,
# "timeAllocation": null,
# "openedFile": null,
# "restartOnExit": null,
# "sshEnabled": null
# },
# "status": {
# "state": "SUCCESS",
# "jobParametersJson": null,
# "startedAt": null,
# "expiresAt": null,
# "resolvedApplication": null,
# "resolvedSupport": null,
# "resolvedProduct": null,
# "allowRestart": false
# },
# "createdAt": 1633588976235,
# "output": null,
# "permissions": null
# }
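The wallet response at the start of this example shows why the Job is short-lived: almost all of the quota is already used. As a rough sketch, using the `quota` and `totalUsage` values copied from that response, the difference and the share used can be computed as below. The variable names are illustrative, and the sketch does not attempt to reproduce the `maxUsable` field, which the server computes itself.

```shell
# Values copied from the example wallet response.
quota=500000000
total_usage=499000000

# Difference between quota and usage, in the wallet's accounting unit.
remaining=$(( quota - total_usage ))

# Share of the quota already used, in tenths of a percent (integer arithmetic).
used_permille=$(( total_usage * 1000 / quota ))

echo "remaining=${remaining} used=$(( used_permille / 10 )).$(( used_permille % 10 ))%"
# prints: remaining=1000000 used=99.8%
```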
Communication Flow: Visual
Example: Extending a Job and terminating it early¶
Frequency of use | Common |
---|---|
Pre-conditions |
|
Actors | An authenticated user (user) |
Communication Flow: Kotlin
/* In this example we will show how a user can extend the duration of a Job. Later in the same
example, we show how the user can cancel it early. */
Jobs.create.call(
bulkRequestOf(JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = SimpleDuration(
hours = 5,
minutes = 0,
seconds = 0,
),
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(FindByStringId(
id = "62348",
)),
)
*/
/* The Job is initially allocated a duration of 5 hours. We can check when it expires by retrieving the Job */
Jobs.retrieve.call(
ResourceRetrieveRequest(
flags = JobIncludeFlags(
filterApplication = null,
filterCreatedAfter = null,
filterCreatedBefore = null,
filterCreatedBy = null,
filterIds = null,
filterProductCategory = null,
filterProductId = null,
filterProvider = null,
filterProviderIds = null,
filterState = null,
hideProductCategory = null,
hideProductId = null,
hideProvider = null,
includeApplication = null,
includeOthers = false,
includeParameters = null,
includeProduct = false,
includeSupport = false,
includeUpdates = false,
),
id = "62348",
),
user
).orThrow()
/*
Job(
createdAt = 1633329776235,
id = "62348",
output = null,
owner = ResourceOwner(
createdBy = "user",
project = null,
),
permissions = null,
specification = JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = SimpleDuration(
hours = 5,
minutes = 0,
seconds = 0,
),
),
status = JobStatus(
allowRestart = false,
expiresAt = 1633347776235,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.RUNNING,
),
updates = listOf(JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.IN_QUEUE,
status = "Your job is now waiting in the queue!",
timestamp = 1633329776235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.RUNNING,
status = "Your job is now running!",
timestamp = 1633329781235,
)),
providerGeneratedId = "62348",
)
*/
/* We can extend the duration quite easily */
Jobs.extend.call(
bulkRequestOf(JobsExtendRequestItem(
jobId = "62348",
requestedTime = SimpleDuration(
hours = 1,
minutes = 0,
seconds = 0,
),
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(Unit),
)
*/
/* The new expiration is reflected if we retrieve it again */
Jobs.retrieve.call(
ResourceRetrieveRequest(
flags = JobIncludeFlags(
filterApplication = null,
filterCreatedAfter = null,
filterCreatedBefore = null,
filterCreatedBy = null,
filterIds = null,
filterProductCategory = null,
filterProductId = null,
filterProvider = null,
filterProviderIds = null,
filterState = null,
hideProductCategory = null,
hideProductId = null,
hideProvider = null,
includeApplication = null,
includeOthers = false,
includeParameters = null,
includeProduct = false,
includeSupport = false,
includeUpdates = false,
),
id = "62348",
),
user
).orThrow()
/*
Job(
createdAt = 1633329776235,
id = "62348",
output = null,
owner = ResourceOwner(
createdBy = "user",
project = null,
),
permissions = null,
specification = JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = SimpleDuration(
hours = 5,
minutes = 0,
seconds = 0,
),
),
status = JobStatus(
allowRestart = false,
expiresAt = 1633351376235,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.RUNNING,
),
updates = listOf(JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.IN_QUEUE,
status = "Your job is now waiting in the queue!",
timestamp = 1633329776235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.RUNNING,
status = "Your job is now running!",
timestamp = 1633329781235,
)),
providerGeneratedId = "62348",
)
*/
/* If the user decides that they are done with the Job early, then they can simply terminate it */
Jobs.terminate.call(
bulkRequestOf(FindByStringId(
id = "62348",
)),
user
).orThrow()
/*
BulkResponse(
responses = listOf(Unit),
)
*/
/* This termination is reflected in the status (and updates) */
Jobs.retrieve.call(
ResourceRetrieveRequest(
flags = JobIncludeFlags(
filterApplication = null,
filterCreatedAfter = null,
filterCreatedBefore = null,
filterCreatedBy = null,
filterIds = null,
filterProductCategory = null,
filterProductId = null,
filterProvider = null,
filterProviderIds = null,
filterState = null,
hideProductCategory = null,
hideProductId = null,
hideProvider = null,
includeApplication = null,
includeOthers = false,
includeParameters = null,
includeProduct = false,
includeSupport = false,
includeUpdates = false,
),
id = "62348",
),
user
).orThrow()
/*
Job(
createdAt = 1633329776235,
id = "62348",
output = null,
owner = ResourceOwner(
createdBy = "user",
project = null,
),
permissions = null,
specification = JobSpecification(
allowDuplicateJob = false,
application = NameAndVersion(
name = "acme-web-application",
version = "1.0.0",
),
name = null,
openedFile = null,
parameters = null,
product = ProductReference(
category = "example-compute",
id = "example-compute",
provider = "example",
),
replicas = 1,
resources = null,
restartOnExit = null,
sshEnabled = null,
timeAllocation = SimpleDuration(
hours = 5,
minutes = 0,
seconds = 0,
),
),
status = JobStatus(
allowRestart = false,
expiresAt = null,
jobParametersJson = null,
resolvedApplication = null,
resolvedProduct = null,
resolvedSupport = null,
startedAt = null,
state = JobState.SUCCESS,
),
updates = listOf(JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.IN_QUEUE,
status = "Your job is now waiting in the queue!",
timestamp = 1633329776235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.RUNNING,
status = "Your job is now running!",
timestamp = 1633329781235,
), JobUpdate(
allowRestart = null,
expectedDifferentState = null,
expectedState = null,
newMounts = null,
newTimeAllocation = null,
outputFolder = null,
state = JobState.SUCCESS,
status = "Your job has been cancelled!",
timestamp = 1633336981235,
)),
providerGeneratedId = "62348",
)
*/
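The `expiresAt` values in the responses above are consistent with simple millisecond arithmetic on the time allocation. The sketch below reproduces them from the example data. The helper `duration_ms` and the variable names are hypothetical, and the observation that the expiry equals `createdAt` plus the allocation holds for this example data only; the source does not state how a provider derives it in general.

```shell
# Convert a SimpleDuration (hours, minutes, seconds) to milliseconds.
duration_ms() {
  echo $(( ($1 * 3600 + $2 * 60 + $3) * 1000 ))
}

created_at=1633329776235    # createdAt of the retrieved Job
running_at=1633329781235    # timestamp of the RUNNING update
cancelled_at=1633336981235  # timestamp of the final update

# Initial expiry: creation time plus the 5 hour timeAllocation.
initial_expiry=$(( created_at + $(duration_ms 5 0 0) ))

# Expiry after extending by the 1 hour requestedTime.
extended_expiry=$(( initial_expiry + $(duration_ms 1 0 0) ))

# How long the Job ran before it was terminated, in minutes.
ran_minutes=$(( (cancelled_at - running_at) / 60000 ))

echo "$initial_expiry $extended_expiry $ran_minutes"
# prints: 1633347776235 1633351376235 120
```

The first two numbers match the `expiresAt` values before and after the call to `extend`, and the last shows that the Job was terminated after two hours, well before either expiry.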
Communication Flow: Curl
# ------------------------------------------------------------------------------------------------------
# $host is the UCloud instance to contact. Example: 'http://localhost:8080' or 'https://cloud.sdu.dk'
# $accessToken is a valid access-token issued by UCloud
# ------------------------------------------------------------------------------------------------------
# In this example we will show how a user can extend the duration of a Job. Later in the same
# example, we show how the user can cancel it early.
# Authenticated as user
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs" -d '{
"items": [
{
"application": {
"name": "acme-web-application",
"version": "1.0.0"
},
"product": {
"id": "example-compute",
"category": "example-compute",
"provider": "example"
},
"name": null,
"replicas": 1,
"allowDuplicateJob": false,
"parameters": null,
"resources": null,
"timeAllocation": {
"hours": 5,
"minutes": 0,
"seconds": 0
},
"openedFile": null,
"restartOnExit": null,
"sshEnabled": null
}
]
}'
# {
# "responses": [
# {
# "id": "62348"
# }
# ]
# }
# The Job is initially allocated a duration of 5 hours. We can check when it expires by retrieving the Job
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/jobs/retrieve?includeProduct=false&includeOthers=false&includeUpdates=false&includeSupport=false&id=62348"
# {
# "id": "62348",
# "owner": {
# "createdBy": "user",
# "project": null
# },
# "updates": [
# {
# "state": "IN_QUEUE",
# "outputFolder": null,
# "status": "Your job is now waiting in the queue!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633329776235
# },
# {
# "state": "RUNNING",
# "outputFolder": null,
# "status": "Your job is now running!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633329781235
# }
# ],
# "specification": {
# "application": {
# "name": "acme-web-application",
# "version": "1.0.0"
# },
# "product": {
# "id": "example-compute",
# "category": "example-compute",
# "provider": "example"
# },
# "name": null,
# "replicas": 1,
# "allowDuplicateJob": false,
# "parameters": null,
# "resources": null,
# "timeAllocation": {
# "hours": 5,
# "minutes": 0,
# "seconds": 0
# },
# "openedFile": null,
# "restartOnExit": null,
# "sshEnabled": null
# },
# "status": {
# "state": "RUNNING",
# "jobParametersJson": null,
# "startedAt": null,
# "expiresAt": 1633347776235,
# "resolvedApplication": null,
# "resolvedSupport": null,
# "resolvedProduct": null,
# "allowRestart": false
# },
# "createdAt": 1633329776235,
# "output": null,
# "permissions": null
# }
# We can extend the duration quite easily
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs/extend" -d '{
"items": [
{
"jobId": "62348",
"requestedTime": {
"hours": 1,
"minutes": 0,
"seconds": 0
}
}
]
}'
# {
# "responses": [
# {
# }
# ]
# }
# The new expiration is reflected if we retrieve it again
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/jobs/retrieve?includeProduct=false&includeOthers=false&includeUpdates=false&includeSupport=false&id=62348"
# {
# "id": "62348",
# "owner": {
# "createdBy": "user",
# "project": null
# },
# "updates": [
# {
# "state": "IN_QUEUE",
# "outputFolder": null,
# "status": "Your job is now waiting in the queue!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633329776235
# },
# {
# "state": "RUNNING",
# "outputFolder": null,
# "status": "Your job is now running!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633329781235
# }
# ],
# "specification": {
# "application": {
# "name": "acme-web-application",
# "version": "1.0.0"
# },
# "product": {
# "id": "example-compute",
# "category": "example-compute",
# "provider": "example"
# },
# "name": null,
# "replicas": 1,
# "allowDuplicateJob": false,
# "parameters": null,
# "resources": null,
# "timeAllocation": {
# "hours": 5,
# "minutes": 0,
# "seconds": 0
# },
# "openedFile": null,
# "restartOnExit": null,
# "sshEnabled": null
# },
# "status": {
# "state": "RUNNING",
# "jobParametersJson": null,
# "startedAt": null,
# "expiresAt": 1633351376235,
# "resolvedApplication": null,
# "resolvedSupport": null,
# "resolvedProduct": null,
# "allowRestart": false
# },
# "createdAt": 1633329776235,
# "output": null,
# "permissions": null
# }
# If the user decides that they are done with the Job early, then they can simply terminate it
curl -XPOST -H "Authorization: Bearer $accessToken" -H "Content-Type: application/json; charset=utf-8" "$host/api/jobs/terminate" -d '{
"items": [
{
"id": "62348"
}
]
}'
# {
# "responses": [
# {
# }
# ]
# }
# This termination is reflected in the status (and updates)
curl -XGET -H "Authorization: Bearer $accessToken" "$host/api/jobs/retrieve?includeProduct=false&includeOthers=false&includeUpdates=false&includeSupport=false&id=62348"
# {
# "id": "62348",
# "owner": {
# "createdBy": "user",
# "project": null
# },
# "updates": [
# {
# "state": "IN_QUEUE",
# "outputFolder": null,
# "status": "Your job is now waiting in the queue!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633329776235
# },
# {
# "state": "RUNNING",
# "outputFolder": null,
# "status": "Your job is now running!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633329781235
# },
# {
# "state": "SUCCESS",
# "outputFolder": null,
# "status": "Your job has been cancelled!",
# "expectedState": null,
# "expectedDifferentState": null,
# "newTimeAllocation": null,
# "allowRestart": null,
# "newMounts": null,
# "timestamp": 1633336981235
# }
# ],
# "specification": {
# "application": {
# "name": "acme-web-application",
# "version": "1.0.0"
# },
# "product": {
# "id": "example-compute",
# "category": "example-compute",
# "provider": "example"
# },
# "name": null,
# "replicas": 1,
# "allowDuplicateJob": false,
# "parameters": null,
# "resources": null,
# "timeAllocation": {
# "hours": 5,
# "minutes": 0,
# "seconds": 0
# },
# "openedFile": null,
# "restartOnExit": null,
# "sshEnabled": null
# },
# "status": {
# "state": "SUCCESS",
# "jobParametersJson": null,
# "startedAt": null,
# "expiresAt": null,
# "resolvedApplication": null,
# "resolvedSupport": null,
# "resolvedProduct": null,
# "allowRestart": false
# },
# "createdAt": 1633329776235,
# "output": null,
# "permissions": null
# }
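The timestamps in the responses above look like milliseconds since the Unix epoch (an assumption based on their magnitude). Under that assumption, the following sketch reproduces both `expiresAt` values from `createdAt`, the initial 5 hour allocation and the 1 hour extension. The helper name `duration_ms` is hypothetical and not part of UCloud.

```shell
# Convert an (hours, minutes, seconds) triple to milliseconds.
# Assumption: UCloud timestamps are milliseconds since the Unix epoch.
duration_ms() {
    echo $(( ($1 * 3600 + $2 * 60 + $3) * 1000 ))
}

created_at=1633329776235                                      # "createdAt" in the responses
initial_expiry=$(( created_at + $(duration_ms 5 0 0) ))       # initial 5 hour allocation
extended_expiry=$(( initial_expiry + $(duration_ms 1 0 0) ))  # after the 1 hour extension

echo "$initial_expiry"    # 1633347776235
echo "$extended_expiry"   # 1633351376235
```

Both numbers match the `expiresAt` fields in the first and second `retrieve` responses of this example.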
Remote Procedure Calls¶
browse
¶
Browses the catalog of all Jobs
Request | Response | Error |
---|---|---|
ResourceBrowseRequest<JobIncludeFlags> |
PageV2<Job> |
CommonErrorMessage |
The catalog of all `Job`s works through the normal pagination, and the return value can be adjusted through the flags. This can include filtering by a specific application or looking at `Job`s of a specific state, such as [`RUNNING`](/docs/reference/dk.sdu.cloud.app.orchestrator.api.JobState.md).
follow
¶
Follow the progress of a job
Request | Response | Error |
---|---|---|
FindByStringId |
JobsFollowResponse |
CommonErrorMessage |
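Opens a WebSocket subscription to receive updates about a job. As described by `JobsFollowResponse`, these updates can include new `JobUpdate` entries, log output and a new `JobStatus`.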
retrieve
¶
Retrieves a single Job
Request | Response | Error |
---|---|---|
ResourceRetrieveRequest<JobIncludeFlags> |
Job |
CommonErrorMessage |
retrieveProducts
¶
Retrieve product support for all accessible providers
Request | Response | Error |
---|---|---|
Unit |
SupportByProvider<Product.Compute, ComputeSupport> |
CommonErrorMessage |
This endpoint will determine all providers which the authenticated user has access to in the current workspace. A user has access to a product, and thus a provider, if the product is either free or if the user has been granted credits to use the product.
retrieveUtilization
¶
Retrieve information about how busy the provider’s cluster currently is
Request | Response | Error |
---|---|---|
JobsRetrieveUtilizationRequest |
JobsRetrieveUtilizationResponse |
CommonErrorMessage |
This endpoint will return information about how busy a cluster is. This endpoint is only used for informational purposes. UCloud does not use this information for any accounting purposes.
search
¶
Searches the catalog of available resources
Request | Response | Error |
---|---|---|
ResourceSearchRequest<JobIncludeFlags> |
PageV2<Job> |
CommonErrorMessage |
create
¶
Creates one or more resources
Request | Response | Error |
---|---|---|
BulkRequest<JobSpecification> |
BulkResponse<FindByStringId> |
CommonErrorMessage |
extend
¶
Extend the duration of one or more jobs
Request | Response | Error |
---|---|---|
BulkRequest<JobsExtendRequestItem> |
BulkResponse<Unit> |
CommonErrorMessage |
This will extend the duration of one or more jobs in a bulk request. Extension of a job will add to the current deadline of a job. Note that not all providers support this feature. Providers which do not support it will have it listed in their manifest. If a provider is asked to extend a deadline when this is not supported, it will send back a 400 Bad Request.
This call makes no guarantee that all jobs are extended in a single transaction. If the provider supports it, then all requests made against a single provider should be made in a single transaction. Clients can determine if their extension request against a specific target was successful by checking if the time remaining of the job has been updated.
This call will return 2XX if all jobs have successfully been extended. The call will fail with the status code from the provider on the first extension which fails. UCloud will not attempt to extend more jobs after the first failure.
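As a minimal sketch of the bulk request described above, the following helper prints an `extend` body for several jobs, and a second helper applies the suggested success check by comparing `expiresAt` before and after. The function names and the second job ID are hypothetical, and the sketch assumes job IDs contain no characters that need JSON escaping.

```shell
# Print a bulk "extend" request body asking for the same number of extra hours
# for every job ID given. Field names follow JobsExtendRequestItem.
# Usage: extend_body HOURS JOB_ID...
extend_body() {
    hours=$1
    shift
    printf '{"items":['
    sep=''
    for job_id in "$@"; do
        printf '%s{"jobId":"%s","requestedTime":{"hours":%s,"minutes":0,"seconds":0}}' \
            "$sep" "$job_id" "$hours"
        sep=','
    done
    printf ']}\n'
}

# An extension counts as applied if the job's expiresAt moved forward.
# Usage: was_extended OLD_EXPIRES_AT NEW_EXPIRES_AT
was_extended() {
    [ "$2" -gt "$1" ]
}

extend_body 1 62348 62349
```

The printed body could be passed to `curl` with `-d "$(extend_body 1 62348 62349)"` against `$host/api/jobs/extend`. Because UCloud stops at the first failure, a client that receives an error would still need to check each job individually.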
init
¶
Request (potential) initialization of resources
Request | Response | Error |
---|---|---|
Unit |
Unit |
CommonErrorMessage |
This request is sent by the client, if the client believes that initialization of resources might be needed. NOTE: This request might be sent even if initialization has already taken place. UCloud/Core does not check if initialization has already taken place, it simply validates the request.
openInteractiveSession
¶
Opens an interactive session (e.g. terminal, web or VNC)
Request | Response | Error |
---|---|---|
BulkRequest<JobsOpenInteractiveSessionRequestItem> |
BulkResponse<OpenSessionWithProvider> |
CommonErrorMessage |
suspend
¶
Suspend a job
Request | Response | Error |
---|---|---|
BulkRequest<FindByStringId> |
BulkResponse<Unit> |
CommonErrorMessage |
Suspends the job, putting it in a paused state. Not all compute backends support this operation. For compute backends which deal with virtual machines, this will shut down the virtual machine without deleting any data.
terminate
¶
Request job cancellation and destruction
Request | Response | Error |
---|---|---|
BulkRequest<FindByStringId> |
BulkResponse<Unit> |
CommonErrorMessage |
This call will request the cancellation of the associated jobs. This will make sure that the jobs are eventually stopped and resources are released. If the job is running a virtual machine, then the virtual machine will be stopped and destroyed. Persistent storage attached to the job will not be deleted; only temporary data from the job will be deleted.
This call is asynchronous and the cancellation may not be immediately visible in the job. Progress can be followed using the `jobs.retrieve`, `jobs.browse` and `jobs.follow` calls.
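Because termination is asynchronous, a client typically polls until the job reaches a final state. The sketch below does this with `jobs.retrieve`. It assumes `curl` and `jq` are installed, that `$host` and `$accessToken` are set as in the examples, and that `EXPIRED` counts as final alongside `SUCCESS` and `FAILURE`. The function names are hypothetical.

```shell
# Fetch the current state of a job into the STATE variable.
# Assumption: the state is read from the "status.state" field of the response.
fetch_state() {
    query="includeProduct=false&includeOthers=false&includeUpdates=false&includeSupport=false&id=$1"
    STATE=$(curl -s -H "Authorization: Bearer $accessToken" \
        "$host/api/jobs/retrieve?$query" | jq -r '.status.state')
}

# Poll until the job reaches a final state or the attempts run out.
# Returns 0 once a final state is seen, 1 otherwise.
# Usage: wait_until_final JOB_ID MAX_ATTEMPTS DELAY_SECONDS
wait_until_final() {
    attempts=0
    while [ "$attempts" -lt "$2" ]; do
        fetch_state "$1"
        case "$STATE" in
            # Treating EXPIRED as final is an assumption of this sketch.
            SUCCESS|FAILURE|EXPIRED) return 0 ;;
        esac
        attempts=$((attempts + 1))
        sleep "$3"
    done
    return 1
}
```

A call such as `wait_until_final 62348 30 2` would check the job up to 30 times, two seconds apart. A WebSocket subscription through `jobs.follow` avoids polling, but is not shown here.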
unsuspend
¶
Unsuspends a job
Request | Response | Error |
---|---|---|
BulkRequest<FindByStringId> |
BulkResponse<Unit> |
CommonErrorMessage |
Reverses the effects of suspending a job. The job is expected to return to an `IN_QUEUE` or `RUNNING` state.
updateAcl
¶
Updates the ACL attached to a resource
Request | Response | Error |
---|---|---|
BulkRequest<UpdatedAcl> |
BulkResponse<Unit> |
CommonErrorMessage |
Data Models¶
Job
¶
A `Job` in UCloud is the core abstraction used to describe a unit of computation.
data class Job(
val id: String,
val owner: ResourceOwner,
val updates: List<JobUpdate>,
val specification: JobSpecification,
val status: JobStatus,
val createdAt: Long,
val output: JobOutput?,
val permissions: ResourcePermissions?,
val providerGeneratedId: String?,
)
They provide users a way to run their computations through a workflow similar to their own workstations, but scaling to much bigger and more machines. In a simplified view, a `Job` describes the following information:
The `Application` which the provider should run, is running or has run (see app-store)
The input parameters required by a `Job`
A reference to the appropriate compute infrastructure, including a reference to the provider
A `Job` is started by a user request containing the `specification` of a `Job`. This information is verified by the UCloud orchestrator and passed to the provider referenced by the `Job` itself. Assuming that the provider accepts this information, the `Job` is placed in its initial state, `IN_QUEUE`. You can read more about the requirements of the compute environment and how to launch the software correctly here.
At this point, the provider has acted on this information by placing the `Job` in its own equivalent of a job queue. Once the provider realizes that the `Job` is running, it will contact UCloud and place the `Job` in the `RUNNING` state. This indicates to UCloud that log files can be retrieved and that interactive interfaces (`VNC`/`WEB`) are available.
Once the `Application` terminates at the provider, the provider will update the state to `SUCCESS`. A `Job` has terminated successfully if no internal error occurred in UCloud and in the provider. This means that a `Job` whose software returns with a non-zero exit code is still considered successful. A `Job` might, for example, be placed in `FAILURE` if the `Application` crashed due to a hardware/scheduler failure. Both `SUCCESS` and `FAILURE` are terminal states. Any `Job` which is in a terminal state can no longer receive any updates or change its state.
At any point after the user submits the `Job`, they may request cancellation of the `Job`. This will stop the `Job`, delete any ephemeral resources and release any bound resources.
Properties
id
: String
Unique identifier for this job.
String
UCloud guarantees that no other job, regardless of compute provider, has the same unique identifier.
owner
: ResourceOwner
A reference to the owner of this job
ResourceOwner
updates
: List<JobUpdate>
A list of status updates from the compute backend.
List<JobUpdate>
The status updates tell a story of what happened with the job. This list is ordered by the timestamp in ascending order. The current state of the job will always be the last element. `updates` is guaranteed to always contain at least one element.
specification
: JobSpecification
The specification used to launch this job.
JobSpecification
This property is always available but must be explicitly requested.
status
: JobStatus
A summary of the `Job`'s current status
JobStatus
createdAt
: Long
Timestamp referencing when the request for creation was received by UCloud
Long
output
: JobOutput?
Information regarding the output of this job.
JobOutput?
permissions
: ResourcePermissions?
Permissions assigned to this resource
ResourcePermissions?
A null value indicates that permissions are not supported by this resource type.
providerGeneratedId
: String?
String?
JobSpecification
¶
A specification of a Job
data class JobSpecification(
val application: NameAndVersion,
val product: ProductReference,
val name: String?,
val replicas: Int?,
val allowDuplicateJob: Boolean?,
val parameters: JsonObject?,
val resources: List<AppParameterValue>?,
val timeAllocation: SimpleDuration?,
val openedFile: String?,
val restartOnExit: Boolean?,
val sshEnabled: Boolean?,
)
Properties
application
: NameAndVersion
A reference to the application which this job should execute
NameAndVersion
product
: ProductReference
A reference to the product that this job will be executed on
ProductReference
name
: String?
A name for this job assigned by the user.
String?
The name can help a user identify why and with which parameters a job was started. This value is suitable for display in user interfaces.
replicas
: Int?
The number of replicas to start this job in
Int?
The `resources` supplied will be mounted in every replica. Some `resources` might only be supported in an ‘exclusive use’ mode. This will cause the job to fail if `replicas != 1`.
allowDuplicateJob
: Boolean?
Allows the job to be started even when a job is running in an identical configuration
Boolean?
By default, UCloud will prevent you from accidentally starting two jobs with identical configuration. This field must be set to `true` to allow you to create two jobs with identical configuration.
parameters
: JsonObject?
Parameters which are consumed by the job
JsonObject?
The available parameters are defined by the `application`. This attribute is not included by default unless `includeParameters` is specified.
resources
: List<AppParameterValue>?
Additional resources which are made available to the job
List<AppParameterValue>?
This attribute is not included by default unless `includeParameters` is specified. Note: Not all resources can be attached to a job. UCloud supports the following parameter types as resources:
file
peer
network
block_storage
ingress
timeAllocation
: SimpleDuration?
Time allocation for the job
SimpleDuration?
This value can be `null`, which signifies that the job should not (automatically) expire. Note that some providers do not support `null`. When this value is not `null`, it means that the job will be terminated, regardless of result, after the duration has expired. Some providers support extending this duration via the `extend` operation.
openedFile
: String?
An optional path to the file which the user selected with the "Open with..." feature.
String?
This value is null if the application is not launched using the “Open with…” feature. The value is passed to the compute environment in a provider-specific way. We encourage providers to expose it as an environment variable named `UCLOUD_OPEN_WITH_FILE` containing the absolute path of the file (in the current environment). Remember that this path is the UCloud path to the file and not the provider’s path.
restartOnExit
: Boolean?
A flag which indicates if this job should be restarted on exit.
Boolean?
Not all providers support this feature and the Job will be rejected if not supported. This information can also be queried through the product support feature.
If this flag is `true` then the Job will automatically be restarted when the provider notifies the orchestrator about process termination. It is the responsibility of the orchestrator to notify the provider about restarts. If the restarts are triggered by the provider, then the provider must not notify the orchestrator about the termination. The orchestrator will trigger a new `create` request in a timely manner.
The orchestrator decides when to trigger a new `create`. For example, if a process is terminating often, then the orchestrator might decide to wait before issuing a new `create`.
sshEnabled
: Boolean?
A flag which indicates that this job should use the built-in SSH functionality of the application/provider
Boolean?
This flag can only be true if the application itself is marked as SSH enabled. When this flag is true, an SSH server will be started which allows the end-user direct access to the associated compute workload.
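For the `openedFile` property described above, an application start script might read the suggested environment variable as sketched below. Whether a given provider actually sets `UCLOUD_OPEN_WITH_FILE` is not guaranteed by this document, so the sketch falls back to a message when it is missing. The function name is hypothetical.

```shell
# Print the file selected through "Open with...", or a fallback message when
# the job was not started that way or the provider does not set the variable.
opened_file_or_default() {
    if [ -n "${UCLOUD_OPEN_WITH_FILE:-}" ]; then
        printf '%s\n' "$UCLOUD_OPEN_WITH_FILE"
    else
        printf '%s\n' "no file selected"
    fi
}

opened_file_or_default
```

The `${VAR:-}` form keeps the script safe under `set -u`, where referencing an unset variable would otherwise abort it.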
JobState
¶
A value describing the current state of a Job
enum class JobState {
IN_QUEUE,
RUNNING,
CANCELING,
SUCCESS,
FAILURE,
EXPIRED,
SUSPENDED,
}
Properties
IN_QUEUE
Any Job which is not yet ready
More specifically, this state should apply to any `Job` for which all of the following hold:
The `Job` has been created
It has never been in a final state
The number of `replicas` which are running is less than the requested amount
RUNNING
A Job where all the tasks are running
More specifically, this state should apply to any `Job` for which all of the following hold:
All `replicas` of the `Job` have been started
📝 NOTE: A `Job` can be `RUNNING` without actually being ready. For example, if a `Job` exposes a web interface, then the web interface doesn’t have to be available yet. That is, the server might still be running its initialization code.
CANCELING
A Job which has been cancelled but has not yet terminated
📝 NOTE: This is only a temporary state. The `Job` is expected to eventually transition to a final state, typically the `SUCCESS` state.
SUCCESS
A Job which has terminated without a _scheduler_ error
📝 NOTE: A `Job` will complete successfully even if the user application exits with an unsuccessful status code.
FAILURE
A Job which has terminated with a failure
📝 NOTE: A `Job` should only fail if it is the scheduler’s fault.
EXPIRED
A Job which has expired and was terminated as a result
This state should only be used if the `timeAllocation` has expired. Any other form of cancellation/termination should result in either `SUCCESS` or `FAILURE`.
SUSPENDED
A Job which might have previously run but is no longer running. This state is not final.
Unlike `SUCCESS` and `FAILURE`, a Job can transition from this state to one of the active states again.
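The `IN_QUEUE` and `RUNNING` definitions above both depend on how many replicas have started. A minimal sketch of that rule, for a job that has never been in a final state, could look as follows. The function name is hypothetical, and real providers may take more factors into account.

```shell
# Decide between IN_QUEUE and RUNNING from replica counts.
# Only meaningful for a job that has never been in a final state.
# Usage: state_from_replicas RUNNING_REPLICAS REQUESTED_REPLICAS
state_from_replicas() {
    if [ "$1" -lt "$2" ]; then
        echo IN_QUEUE
    else
        echo RUNNING
    fi
}

state_from_replicas 1 2   # IN_QUEUE: one of two replicas has started
state_from_replicas 2 2   # RUNNING: all replicas have started
```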
InteractiveSessionType
¶
A value describing a type of ‘interactive’ session
enum class InteractiveSessionType {
WEB,
VNC,
SHELL,
}
Properties
WEB
VNC
SHELL
JobIncludeFlags
¶
Flags used to tweak read operations of Jobs
data class JobIncludeFlags(
val filterApplication: String?,
val filterState: JobState?,
val includeParameters: Boolean?,
val includeApplication: Boolean?,
val includeProduct: Boolean?,
val includeOthers: Boolean?,
val includeUpdates: Boolean?,
val includeSupport: Boolean?,
val filterCreatedBy: String?,
val filterCreatedAfter: Long?,
val filterCreatedBefore: Long?,
val filterProvider: String?,
val filterProductId: String?,
val filterProductCategory: String?,
val filterProviderIds: String?,
val filterIds: String?,
val hideProductId: String?,
val hideProductCategory: String?,
val hideProvider: String?,
)
Properties
filterApplication
: String?
String?
filterState
: JobState?
JobState?
includeParameters
: Boolean?
Includes `specification.parameters` and `specification.resources`
Boolean?
includeApplication
: Boolean?
Includes `status.resolvedApplication`
Boolean?
includeProduct
: Boolean?
Includes `status.resolvedProduct`
Boolean?
includeOthers
: Boolean?
Boolean?
includeUpdates
: Boolean?
Boolean?
includeSupport
: Boolean?
Boolean?
filterCreatedBy
: String?
String?
filterCreatedAfter
: Long?
Long?
filterCreatedBefore
: Long?
Long?
filterProvider
: String?
String?
filterProductId
: String?
String?
filterProductCategory
: String?
String?
filterProviderIds
: String?
Filters by the provider ID. The value is comma-separated.
String?
filterIds
: String?
Filters by the resource ID. The value is comma-separated.
String?
hideProductId
: String?
String?
hideProductCategory
: String?
String?
hideProvider
: String?
String?
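The flags above are passed as query parameters, as the `retrieve` calls in the examples show. The following sketch builds such a URL and joins several IDs into the comma-separated form described for `filterIds` and `filterProviderIds`. The function names are hypothetical, and values are not URL-encoded, so the sketch only suits simple values such as booleans and numeric IDs.

```shell
# Build a jobs "retrieve" URL from a job ID and any number of include flags.
# Usage: retrieve_url HOST JOB_ID [FLAG=VALUE]...
retrieve_url() {
    url="$1/api/jobs/retrieve?id=$2"
    shift 2
    for flag in "$@"; do
        url="$url&$flag"
    done
    printf '%s\n' "$url"
}

# Join IDs with commas. Runs in a subshell so the IFS change does not leak.
# Usage: join_ids ID...
join_ids() (
    IFS=,
    printf '%s\n' "$*"
)

retrieve_url "https://cloud.sdu.dk" 62348 includeUpdates=true includeParameters=true
join_ids 62348 62349 62350
```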
ComputeSupport
¶
data class ComputeSupport(
val product: ProductReference,
val docker: ComputeSupport.Docker?,
val virtualMachine: ComputeSupport.VirtualMachine?,
val native: ComputeSupport.Native?,
val maintenance: Maintenance?,
)
Properties
product
: ProductReference
ProductReference
docker
: ComputeSupport.Docker?
Support for `Tool`s using the `DOCKER` backend
ComputeSupport.Docker?
virtualMachine
: ComputeSupport.VirtualMachine?
Support for `Tool`s using the `VIRTUAL_MACHINE` backend
ComputeSupport.VirtualMachine?
native
: ComputeSupport.Native?
Support for `Tool`s using the `NATIVE` backend
ComputeSupport.Native?
maintenance
: Maintenance?
Maintenance?
ComputeSupport.Docker
¶
data class Docker(
val enabled: Boolean?,
val web: Boolean?,
val vnc: Boolean?,
val logs: Boolean?,
val terminal: Boolean?,
val peers: Boolean?,
val timeExtension: Boolean?,
val utilization: Boolean?,
)
Properties
enabled
: Boolean?
Flag to enable/disable this feature
Boolean?
All other flags are ignored if this is `false`.
web
: Boolean?
Flag to enable/disable the interactive interface of `WEB` `Application`s
Boolean?
vnc
: Boolean?
Flag to enable/disable the interactive interface of `VNC` `Application`s
Boolean?
logs
: Boolean?
Flag to enable/disable the log API
Boolean?
terminal
: Boolean?
Flag to enable/disable the interactive terminal API
Boolean?
peers
: Boolean?
Flag to enable/disable connection between peering `Job`s
Boolean?
timeExtension
: Boolean?
Flag to enable/disable extension of jobs
Boolean?
utilization
: Boolean?
Flag to enable/disable the retrieveUtilization of jobs
Boolean?
ComputeSupport.Native
¶
data class Native(
val enabled: Boolean?,
val logs: Boolean?,
val vnc: Boolean?,
val terminal: Boolean?,
val timeExtension: Boolean?,
val utilization: Boolean?,
val web: Boolean?,
)
Properties
enabled
: Boolean?
Flag to enable/disable this feature
Boolean?
All other flags are ignored if this is `false`.
logs
: Boolean?
Flag to enable/disable the log API
Boolean?
vnc
: Boolean?
Flag to enable/disable the VNC API
Boolean?
terminal
: Boolean?
Flag to enable/disable the interactive terminal API
Boolean?
timeExtension
: Boolean?
Flag to enable/disable extension of jobs
Boolean?
utilization
: Boolean?
Flag to enable/disable the retrieveUtilization of jobs
Boolean?
web
: Boolean?
Flag to enable/disable the interactive interface of `WEB` `Application`s
Boolean?
ComputeSupport.VirtualMachine
¶
data class VirtualMachine(
val enabled: Boolean?,
val logs: Boolean?,
val vnc: Boolean?,
val terminal: Boolean?,
val timeExtension: Boolean?,
val suspension: Boolean?,
val utilization: Boolean?,
)
Properties
enabled
: Boolean?
Flag to enable/disable this feature
Boolean?
All other flags are ignored if this is `false`.
logs
: Boolean?
Flag to enable/disable the log API
Boolean?
vnc
: Boolean?
Flag to enable/disable the VNC API
Boolean?
terminal
: Boolean?
Flag to enable/disable the interactive terminal API
Boolean?
timeExtension
: Boolean?
Flag to enable/disable extension of jobs
Boolean?
suspension
: Boolean?
Flag to enable/disable suspension of jobs
Boolean?
utilization
: Boolean?
Flag to enable/disable the retrieveUtilization of jobs
Boolean?
CpuAndMemory
¶
data class CpuAndMemory(
val cpu: Double,
val memory: Long,
)
ExportedParameters
¶
data class ExportedParameters(
val siteVersion: Int,
val request: ExportedParametersRequest,
val resolvedResources: ExportedParameters.Resources?,
val machineType: JsonObject,
)
Properties
siteVersion
: Int
Int
request
: ExportedParametersRequest
ExportedParametersRequest
resolvedResources
: ExportedParameters.Resources?
ExportedParameters.Resources?
machineType
: JsonObject
JsonObject
ExportedParameters.Resources
¶
data class Resources(
val ingress: JsonObject?,
)
Properties
ingress
: JsonObject?
JsonObject?
JobBindKind
¶
enum class JobBindKind {
BIND,
UNBIND,
}
Properties
BIND
UNBIND
JobBinding
¶
data class JobBinding(
val kind: JobBindKind,
val job: String,
)
JobOutput
¶
data class JobOutput(
val outputFolder: String?,
)
Properties
outputFolder
: String?
String?
JobStatus
¶
Describes the current state of the Resource
data class JobStatus(
val state: JobState,
val jobParametersJson: ExportedParameters?,
val startedAt: Long?,
val expiresAt: Long?,
val resolvedApplication: Application?,
val resolvedSupport: ResolvedSupport<Product.Compute, ComputeSupport>?,
val resolvedProduct: Product.Compute?,
val allowRestart: Boolean?,
)
The contents of this field depend almost entirely on the specific `Resource` that this field is managing. Typically, this will contain information such as:
A state value. For example, a compute `Job` might be `RUNNING`.
Key metrics about the resource.
Related resources. For example, certain `Resource`s are bound to another `Resource` in a mutually exclusive way; this should be listed in the `status` section.
Properties
state
: JobState
The current state of the `Job`.
JobState
This will match the latest state set in the `updates`.
jobParametersJson
: ExportedParameters?
ExportedParameters?
startedAt
: Long?
Timestamp matching when the `Job` most recently transitioned to the `RUNNING` state.
Long?
For `Job`s which suspend, this might occur multiple times. This will always point to the latest point in time it started running.
expiresAt
: Long?
Timestamp matching when the `Job` is set to expire.
Long?
This is generally equal to `startedAt + timeAllocation`. Note that this field might be `null` if the `Job` has no associated deadline. For `Job`s that suspend, however, this is more likely to be equal to the initial `RUNNING` state + `timeAllocation`.
resolvedApplication
: Application?
The resolved application referenced by `application`.
Application?
This attribute is not included by default unless `includeApplication` is specified.
resolvedSupport
: ResolvedSupport<Product.Compute, ComputeSupport>?
ResolvedSupport<Product.Compute, ComputeSupport>?
resolvedProduct
: Product.Compute?
The resolved product referenced by `product`.
Product.Compute?
This attribute is not included by default unless `includeProduct` is specified.
allowRestart
: Boolean?
Boolean?
JobUpdate
¶
Describes an update to the Resource
data class JobUpdate(
val state: JobState?,
val outputFolder: String?,
val status: String?,
val expectedState: JobState?,
val expectedDifferentState: Boolean?,
val newTimeAllocation: Long?,
val allowRestart: Boolean?,
val newMounts: List<String>?,
val timestamp: Long?,
)
Updates can optionally be fetched for a `Resource`. The updates describe how the `Resource` changes state over time. The current state of a `Resource` can typically be read from its `status` field. Thus, it is typically not needed to use the full update history if you only wish to know the current state of a `Resource`.
An update will typically contain information similar to the `status` field, for example:
A state value. For example, a compute `Job` might be `RUNNING`.
Change in key metrics.
Bindings to related `Resource`s.
Properties
state
: JobState?
JobState?
outputFolder
: String?
String?
status
: String?
A generic text message describing the current status of the `Resource`
String?
expectedState
: JobState?
JobState?
expectedDifferentState
: Boolean?
Boolean?
newTimeAllocation
: Long?
Long?
allowRestart
: Boolean?
Boolean?
timestamp
: Long?
A timestamp referencing when UCloud received this update
Long?
JobsLog
¶
data class JobsLog(
val rank: Int,
val stdout: String?,
val stderr: String?,
)
OpenSession
¶
sealed class OpenSession {
abstract val domainOverride: String?
abstract val jobId: String
abstract val rank: Int
class Shell : OpenSession()
class Vnc : OpenSession()
class Web : OpenSession()
}
OpenSession.Shell
¶
data class Shell(
val jobId: String,
val rank: Int,
val sessionIdentifier: String,
val domainOverride: String?,
val type: String /* "shell" */,
)
OpenSession.Vnc
¶
data class Vnc(
val jobId: String,
val rank: Int,
val url: String,
val password: String?,
val domainOverride: String?,
val type: String /* "vnc" */,
)
OpenSession.Web
¶
data class Web(
val jobId: String,
val rank: Int,
val redirectClientTo: String,
val domainOverride: String?,
val type: String /* "web" */,
)
OpenSessionWithProvider
¶
data class OpenSessionWithProvider(
val providerDomain: String,
val providerId: String,
val session: OpenSession,
)
QueueStatus
¶
data class QueueStatus(
val running: Int,
val pending: Int,
)
ExportedParametersRequest
¶
data class ExportedParametersRequest(
val application: NameAndVersion,
val product: ProductReference,
val name: String?,
val replicas: Int,
val parameters: JsonObject,
val resources: List<JsonObject>,
val timeAllocation: SimpleDuration?,
val resolvedProduct: JsonObject?,
val resolvedApplication: JsonObject?,
val resolvedSupport: JsonObject?,
val allowDuplicateJob: Boolean?,
val sshEnabled: Boolean?,
)
Properties
application
: NameAndVersion
NameAndVersion
product
: ProductReference
ProductReference
name
: String?
String?
replicas
: Int
Int
parameters
: JsonObject
JsonObject
resources
: List<JsonObject>
List<JsonObject>
timeAllocation
: SimpleDuration?
SimpleDuration?
resolvedProduct
: JsonObject?
JsonObject?
resolvedApplication
: JsonObject?
JsonObject?
resolvedSupport
: JsonObject?
JsonObject?
allowDuplicateJob
: Boolean?
Boolean?
sshEnabled
: Boolean?
Boolean?
JobsExtendRequestItem
¶
data class JobsExtendRequestItem(
val jobId: String,
val requestedTime: SimpleDuration,
)
JobsOpenInteractiveSessionRequestItem
¶
data class JobsOpenInteractiveSessionRequestItem(
val id: String,
val rank: Int,
val sessionType: InteractiveSessionType,
)
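A request body for `openInteractiveSession` can be sketched from the fields of `JobsOpenInteractiveSessionRequestItem`. The sketch assumes that the enum is serialized by name, as `JobState` is in the examples, and that the item is wrapped in `items` like the other bulk requests. The function name is hypothetical.

```shell
# Print a bulk request body holding one JobsOpenInteractiveSessionRequestItem.
# Assumption: sessionType is serialized as the enum name (WEB, VNC or SHELL).
# Usage: open_session_body JOB_ID RANK SESSION_TYPE
open_session_body() {
    case "$3" in
        WEB|VNC|SHELL) ;;
        *) echo "unknown session type: $3" >&2; return 1 ;;
    esac
    printf '{"items":[{"id":"%s","rank":%s,"sessionType":"%s"}]}\n' "$1" "$2" "$3"
}

open_session_body 62348 0 SHELL
```

Rejecting unknown session types locally avoids a round trip for a request that the server would presumably reject anyway.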
JobsRetrieveUtilizationRequest
¶
data class JobsRetrieveUtilizationRequest(
val jobId: String,
)
Properties
jobId
: String
String
JobsFollowResponse
¶
data class JobsFollowResponse(
val updates: List<JobUpdate>?,
val log: List<JobsLog>?,
val newStatus: JobStatus?,
)
JobsRetrieveUtilizationResponse
¶
data class JobsRetrieveUtilizationResponse(
val capacity: CpuAndMemory,
val usedCapacity: CpuAndMemory,
val queueStatus: QueueStatus,
)
Properties
capacity
: CpuAndMemory
The total capacity of the entire compute system
CpuAndMemory
usedCapacity
: CpuAndMemory
The capacity currently in use, by running jobs, of the entire compute system
CpuAndMemory
queueStatus
: QueueStatus
The status of the queue
QueueStatus
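Since the utilization response is informational only, a client would mostly use it for display. A minimal sketch, assuming the `cpu` or `memory` values of `usedCapacity` and `capacity` have already been extracted from the response, computes the used share as a percentage. The function name and the sample numbers are hypothetical.

```shell
# Print the used share of a capacity value as a percentage with one decimal,
# or "n/a" when the capacity is zero or negative.
# Usage: used_percent USED CAPACITY
used_percent() {
    LC_ALL=C awk -v used="$1" -v cap="$2" 'BEGIN {
        if (cap <= 0) { print "n/a"; exit }
        printf "%.1f\n", used / cap * 100
    }'
}

used_percent 48 128   # 37.5
```

`awk` is used because POSIX shell arithmetic is integer-only, while `cpu` is declared as a `Double`.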