UFile
A `UFile` is a resource for storing, retrieving and organizing data in UCloud.
    data class UFile(
        val id: String,
        val specification: UFileSpecification,
        val createdAt: Long,
        val status: UFileStatus,
        val owner: ResourceOwner,
        val permissions: ResourcePermissions?,
        val updates: List<UFileUpdate>,
        val providerGeneratedId: String?,
    )
A file in UCloud (`UFile`) closely follows the concept of a computer file you might already be familiar with. The functionality of a file is mostly determined by its `type`. The two most important types are the `DIRECTORY` and `FILE` types. A `DIRECTORY` is a container of `UFile`s. A directory can itself contain more directories, which leads to a natural tree-like structure. `FILE`s, also referred to as regular files, are data records which each contain a series of bytes.
All files in UCloud have a name associated with them. This name uniquely identifies them within their directory. All files in UCloud belong to exactly one directory.
File operations must be able to reference the files on which they operate. In UCloud, these references are made through the `id` property, also known as a path. A path uses the tree-like structure of files to reference a file: it declares which directories to go through, starting at the top, to reach the file being referenced. This information is serialized as a textual string, where each step of the path is separated by a forward-slash `/` (`U+002F`). The path must start with a single forward-slash, which signifies the root of the file tree. UCloud never uses ‘relative’ file paths, which some systems use.
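To make the path structure concrete, the following is a minimal Kotlin sketch of how a client might split a path into its steps and derive the file name and the containing directory. The helper names (`pathSteps`, `fileName`, `parentPath`) are hypothetical and not part of the UCloud API. Collapsing repeated slashes is an assumption made for brevity.

```kotlin
// Hypothetical helpers for illustration; these names are not part of the UCloud API.

/** Splits an absolute path such as "/a/b/c" into its steps: ["a", "b", "c"]. */
fun pathSteps(path: String): List<String> {
    require(path.startsWith("/")) { "UCloud paths must be absolute: $path" }
    // Assumption: empty steps (from repeated or trailing slashes) are ignored.
    return path.split("/").filter { it.isNotEmpty() }
}

/** The name of the file a path refers to, that is its last step ("" for the root). */
fun fileName(path: String): String = pathSteps(path).lastOrNull() ?: ""

/** The path of the containing directory ("/" for the root and for top-level files). */
fun parentPath(path: String): String =
    "/" + pathSteps(path).dropLast(1).joinToString("/")
```

For a path such as `/home/alice/thesis-42.docx`, the name is the last step and the parent is everything before it. A relative path is rejected outright, in line with the rule that UCloud paths are always absolute.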
All files in UCloud additionally have metadata associated with them. For this we differentiate between system-level metadata and user-defined metadata.
We have just covered two examples of system-level metadata: the `id` (path) and the `type`. UCloud additionally supports metadata such as general stats about the files, for example file sizes. All files have a set of `permissions` associated with them; providers may optionally expose this information to UCloud and the users.
User-defined metadata describes the contents of a file. All metadata is described by a template (`FileMetadataTemplate`), which defines a document structure for the metadata. User-defined metadata can be used for a variety of purposes, such as DataCite metadata, sensitivity levels, and other field-specific metadata formats.
Properties
`id`: `String`
A unique reference to a file
All files in UCloud have a `name` associated with them. This name uniquely identifies them within their directory. All files in UCloud belong to exactly one directory. A `name` can be any textual string, for example: `thesis-42.docx`. However, certain restrictions apply to file `name`s; see below for a concrete list of rules and recommendations.
The `extension` of a file is typically used as a hint to clients about how to treat a specific file. For example, an extension might indicate that the file contains a video of a specific format. In UCloud, the file’s `extension` is derived from its `name`: it is simply defined as the text immediately following, and not including, the last period `.` (`U+002E`). The table below shows some examples of how UCloud determines the extension of a file:
| File name | Derived extension | Comment |
|---|---|---|
| `thesis-42.docx` | `docx` | - |
| `thesis-43-final.tar` | `tar` | - |
| `thesis-43-FINAL2.tar.gz` | `gz` | Note that UCloud does not recognize `tar` as being part of the extension |
| `thesis` | Empty string | |
| `.ssh` | `ssh` | 'Hidden' files also have a surprising extension in UCloud |
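The rule behind the table can be sketched in a single Kotlin function. The function name `deriveExtension` is hypothetical; it only illustrates the definition given above (the text after the last period, or the empty string when there is no period) and is not necessarily how UCloud implements it.

```kotlin
// Hypothetical helper; illustrates the definition above, not UCloud's actual code.

/** Returns the text after the last '.' in [name], or "" if [name] contains no '.'. */
fun deriveExtension(name: String): String = name.substringAfterLast('.', "")
```

Applied to the names in the table, this yields `docx`, `tar`, `gz`, the empty string and `ssh` respectively.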
File operations must be able to reference the files on which they operate. In UCloud, these references are made through the `path` property. A path uses the tree-like structure of files to reference a file: it declares which directories to go through, starting at the top, to reach the file being referenced. This information is serialized as a textual string, where each step of the path is separated by a forward-slash `/` (`U+002F`). The path must start with a single forward-slash, which signifies the root of the file tree. UCloud never uses ‘relative’ file paths, which some systems use.
UCloud paths are structured so that they are unique across all providers and file systems. The figure below shows how a UCloud path is structured, and how it can be mapped to an internal file-system path.
Figure: At the top, a UCloud path along with the components of it. At the bottom, an example of an internal, provider specific, file-system path.
The figure shows how a UCloud path consists of four components:
1. The ‘Provider ID’ references the provider who owns and hosts the file
2. The product reference identifies the product that is hosting the `FileCollection`
3. The `FileCollection` ID references the ID of the internal file collection. These are controlled by the provider and match the different types of file-systems they have available. A single file collection typically maps to a specific folder on the provider’s file-system.
4. The internal path, which tells the provider how to find the file within the collection. Providers can typically pass this as a one-to-one mapping.
Rules of a file `name`:

1. The `name` cannot be equal to `.` (commonly interpreted to mean the current directory)
2. The `name` cannot be equal to `..` (commonly interpreted to mean the parent directory)
3. The `name` cannot contain a forward-slash `/` (`U+002F`)
4. Names are strictly unicode
UCloud will normalize a path that contains `.` or `..` as one of its steps. The path is normalized according to the interpretations given in rules 1 and 2.
Note that all paths in UCloud are strictly unicode (rule 4). This is different from the unix standard, where file names can contain arbitrary binary data. (TODO determine how providers should handle this edge-case)
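The four name rules above can be checked mechanically. The sketch below is hypothetical (the names `violatedNameRules` and `isWellFormedUnicode` are not from the UCloud API). It also assumes that "strictly unicode" can be approximated on the JVM by requiring the UTF-16 string to contain no unpaired surrogates; the source does not say how UCloud enforces rule 4.

```kotlin
// Hypothetical sketch; the function names are not part of the UCloud API.

/** Assumption: "strictly unicode" is approximated as "no unpaired UTF-16 surrogates". */
fun isWellFormedUnicode(text: String): Boolean {
    var i = 0
    while (i < text.length) {
        val c = text[i]
        when {
            c.isHighSurrogate() -> {
                // A high surrogate must be immediately followed by a low surrogate.
                if (i + 1 >= text.length || !text[i + 1].isLowSurrogate()) return false
                i += 2
            }
            c.isLowSurrogate() -> return false // low surrogate without a preceding high one
            else -> i += 1
        }
    }
    return true
}

/** Returns the numbers of the rules (1 to 4) that [name] violates; empty means the name is allowed. */
fun violatedNameRules(name: String): List<Int> = buildList {
    if (name == ".") add(1)               // rule 1: current directory
    if (name == "..") add(2)              // rule 2: parent directory
    if ('/' in name) add(3)               // rule 3: no forward-slash
    if (!isWellFormedUnicode(name)) add(4) // rule 4: strictly unicode (see assumption above)
}
```

Returning the rule numbers rather than a boolean keeps the result easy to map back to the list above when reporting an error to a user.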
Additionally, UCloud recommends the following to users regarding file `name`s:

- Avoid the following file names:
  - Containing Windows reserved characters: `<`, `>`, `:`, `"`, `/`, `|`, `?`, `*`, `\`
  - Any of the reserved file names in Windows:
    - `AUX`
    - `COM1`, `COM2`, `COM3`, `COM4`, `COM5`, `COM6`, `COM7`, `COM8`, `COM9`
    - `CON`
    - `LPT1`, `LPT2`, `LPT3`, `LPT4`, `LPT5`, `LPT6`, `LPT7`, `LPT8`, `LPT9`
    - `NUL`
    - `PRN`
    - Any of the above followed by an extension
- Avoid ASCII control characters (decimal value 0-31 both inclusive)
- Avoid Unicode control characters (e.g. right-to-left override)
- Avoid line breaks, paragraph separators and other unicode separators which are typically interpreted as a line-break
- Avoid binary names
UCloud will attempt to reject these for file operations initiated through the client, but it cannot ensure that these files do not appear regardless. This is due to the fact that the file systems are typically mounted directly by user-controlled jobs.
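A client-side check for these recommendations might look like the following sketch. All names (`NameWarning`, `nameWarnings` and the two sets) are hypothetical. Matching reserved names case-insensitively is an assumption based on how Windows treats them, and the Unicode checks rely on the character categories exposed by the Kotlin standard library. Binary names are not covered, since a Kotlin `String` cannot represent them.

```kotlin
// Hypothetical sketch; none of these names come from the UCloud API.
enum class NameWarning {
    RESERVED_CHARACTER, RESERVED_NAME, ASCII_CONTROL, UNICODE_CONTROL, LINE_SEPARATOR
}

val windowsReservedCharacters = setOf('<', '>', ':', '"', '/', '|', '?', '*', '\\')

val windowsReservedNames: Set<String> =
    setOf("AUX", "CON", "NUL", "PRN") + (1..9).map { "COM$it" } + (1..9).map { "LPT$it" }

/** Returns the recommendations that [name] goes against; an empty set means none were found. */
fun nameWarnings(name: String): Set<NameWarning> {
    val warnings = mutableSetOf<NameWarning>()
    if (name.any { it in windowsReservedCharacters }) warnings += NameWarning.RESERVED_CHARACTER
    // "Any of the above followed by an extension": compare the text before the first period.
    // Assumption: upper-casing, because Windows matches these names case-insensitively.
    if (name.substringBefore('.').uppercase() in windowsReservedNames) {
        warnings += NameWarning.RESERVED_NAME
    }
    for (c in name) {
        when {
            c.code in 0..31 -> warnings += NameWarning.ASCII_CONTROL
            // Other control characters and format characters such as U+202E (right-to-left override).
            c.category == CharCategory.CONTROL || c.category == CharCategory.FORMAT ->
                warnings += NameWarning.UNICODE_CONTROL
            // U+2028 (line separator) and U+2029 (paragraph separator).
            c.category == CharCategory.LINE_SEPARATOR ||
                c.category == CharCategory.PARAGRAPH_SEPARATOR ->
                warnings += NameWarning.LINE_SEPARATOR
        }
    }
    return warnings
}
```

These are recommendations rather than hard rules, so the sketch reports warnings instead of rejecting the name.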
Rules of a file `path`:

1. All paths must be absolute, that is they must start with `/`
2. UCloud will normalize all path ‘steps’ containing either `.` or `..`
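The two path rules can be illustrated with a small normalization routine. This is a sketch under stated assumptions, not UCloud's implementation: the name `normalizePath` is hypothetical, empty steps are dropped, and a `..` at the root is assumed to stay at the root.

```kotlin
// Hypothetical sketch of path normalization; not UCloud's actual implementation.

/** Resolves "." and ".." steps in an absolute path and returns the normalized path. */
fun normalizePath(path: String): String {
    require(path.startsWith("/")) { "UCloud paths must be absolute: $path" }
    val steps = mutableListOf<String>()
    for (step in path.split("/")) {
        when (step) {
            "", "." -> Unit // empty steps and the current directory change nothing
            ".." -> {
                // Parent directory: drop the previous step. Assumption: ".." at the root is ignored.
                if (steps.isNotEmpty()) steps.removeAt(steps.lastIndex)
            }
            else -> steps.add(step)
        }
    }
    return "/" + steps.joinToString("/")
}
```

For example, `/home/alice/./thesis/../notes.txt` normalizes to `/home/alice/notes.txt`.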
Additionally, UCloud recommends the following to users regarding `path`s:

- Avoid long paths:
  - Older versions of Unixes report `PATH_MAX` as 1024
  - Newer versions of Unixes report `PATH_MAX` as 4096
  - Older versions of Windows start failing above 256 characters
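As a rough illustration, a client could flag paths that exceed the limits listed above. The names `PathLimit` and `exceededLimits` are hypothetical. Measuring the Unix limits in UTF-8 bytes and the Windows limit in UTF-16 characters is an assumption, and the comparison is approximate (for instance, it ignores the terminating null byte that `PATH_MAX` accounts for).

```kotlin
// Hypothetical sketch; the limits are the ones quoted in the list above.
enum class PathLimit(val limit: Int) {
    OLD_WINDOWS(256), // characters
    OLD_UNIX(1024),   // PATH_MAX on older Unixes
    NEW_UNIX(4096)    // PATH_MAX on newer Unixes
}

/** Returns the limits that [path] exceeds, in ascending order of the limit. */
fun exceededLimits(path: String): List<PathLimit> {
    // Assumption: Unix limits are compared against the UTF-8 encoded length.
    val utf8Length = path.encodeToByteArray().size
    return PathLimit.values().filter { limit ->
        val length = if (limit == PathLimit.OLD_WINDOWS) path.length else utf8Length
        length > limit.limit
    }
}
```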
`specification`: `UFileSpecification`
`createdAt`: `Long`
Timestamp referencing when the request for creation was received by UCloud
`status`: `UFileStatus`
Holds the current status of the `Resource`
`owner`: `ResourceOwner`
Contains information about the original creator of the `Resource` along with project association
`permissions`: `ResourcePermissions?`
Permissions assigned to this resource
A null value indicates that permissions are not supported by this resource type.
`updates`: `List<UFileUpdate>`
`providerGeneratedId`: `String?`