API

IncCSV.IncFile - Type
IncFile(metadata, table; csv_start=1)

Container returned by readinc.

An IncFile stores the parsed metadata, the table returned by CSV.jl, and the line number where the CSV component starts. Use the accessor functions metadata(file) and table(file) to retrieve the metadata and the table.

Fields

  • metadata: dictionary of top-level metadata values and one-level sections.
  • table: the parsed CSV table, such as a CSV.File or DataFrame.
  • csv_start: 1-based line number of the CSV header or first CSV row.

IncCSV.IncSchema - Type
IncSchema(fields; allow_extra=true)

A lightweight schema for INC metadata.

Schemas are read from ordinary INC metadata blocks using readschema. Each schema field has a path, a requirement level (:must, :must_not, or :optional), a type descriptor string, and optional descriptive text. Type descriptors are recorded for humans and downstream tools; IncCSV does not parse metadata values beyond its normal Int/String inference.

By default, metadata fields not described by the schema are allowed. If allow_extra is false, validation fails when a file includes metadata outside the schema's MUST, MUST_NOT, and OPTIONAL fields.

IncCSV.IncSummary - Type
IncSummary(source, title, rows, columns, metadata_fields, csv_start)

Small summary returned by summarise.

The summary records the source path when available, a title if the metadata has one, row and column information from the CSV component, metadata field paths, and the line where CSV data starts. Display an IncSummary or pass it to printsummary for a compact human-readable report.

IncCSV.SchemaValidation - Type
SchemaValidation(valid, missing, extra, forbidden)

Validation report returned by validateschema.

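  • valid is true when all MUST schema fields are present, no MUST_NOT field is present, and, if the schema sets allow_extra=false, no extra fields were found.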
  • missing lists required field paths that were not found.
  • extra lists metadata field paths found in the file but not described by the schema.
  • forbidden lists MUST_NOT field paths that were found.

Extra fields do not make a file invalid unless the schema has allow_extra=false. Forbidden fields always make a file invalid.

IncCSV.readinc - Function
readinc(path; csvkwargs...)
readinc(path, sink; csvkwargs...)

Read an INC file and return an IncFile.

The metadata component is parsed by IncCSV. The CSV component is read by CSV.jl. Without sink, the table is a CSV.File. With sink, IncCSV delegates to CSV.read; for example readinc(path, DataFrame) returns an IncFile whose table is a DataFrame.

Plain CSV files are accepted and returned with empty metadata.

CSV.jl keyword options are forwarded to the CSV reader. By default, IncCSV sets the CSV header line to the first line after the closing metadata delimiter. CSV.jl options apply to the CSV component, not to the metadata preamble. If you pass header or skipto yourself, your explicit CSV.jl options are used.
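
To illustrate the default header position, the following sketch locates the closing metadata delimiter in a vector of lines. It assumes that a metadata block, when present, opens with a --- line at the very top of the file and closes at the next --- line. The helper name default_header_line is hypothetical, and this is not IncCSV's actual implementation.

```julia
# Hypothetical helper: returns the 1-based line number that would serve as
# the CSV header line. Assumes the metadata block, if any, opens with `---`
# on line 1 and closes at the next `---` line.
function default_header_line(lines::AbstractVector{<:AbstractString})
    # Plain CSV: no opening delimiter, so the CSV component starts on line 1.
    (isempty(lines) || strip(lines[1]) != "---") && return 1
    closing = findnext(l -> strip(l) == "---", lines, 2)
    closing === nothing && error("metadata block is never closed")
    return closing + 1
end

inc_lines = ["---", "title = Example data", "---", "time,temperature", "0,21"]
csv_lines = ["time,temperature", "0,21"]

default_header_line(inc_lines)  # 4
default_header_line(csv_lines)  # 1
```

Passing header or skipto explicitly to readinc replaces this default, as described above.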

INC files can also include a [structure] metadata section to provide portable reader options for the CSV component. Supported keys are delim, delimiter, quotechar, escapechar, comment, header, and footerskip. delimiter is an alias for delim and takes precedence if both are present. Explicit keyword arguments passed to readinc override [structure] values.
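
The precedence rules for [structure] can be pictured with a small merge function. This is only a sketch of the behavior described above, written against plain dictionaries; resolve_csv_options is a hypothetical name and the real reader may organize this differently.

```julia
# Hypothetical helper: combine [structure] metadata with explicit keywords.
function resolve_csv_options(structure::AbstractDict; kwargs...)
    opts = Dict{Symbol,Any}()
    for (key, value) in structure
        key == "delimiter" && continue   # alias, handled after the loop
        opts[Symbol(key)] = value
    end
    # `delimiter` is an alias for `delim` and wins when both are present.
    haskey(structure, "delimiter") && (opts[:delim] = structure["delimiter"])
    # Explicit keyword arguments override anything taken from [structure].
    for (key, value) in kwargs
        opts[key] = value
    end
    return opts
end

structure = Dict{String,Any}("delim" => ",", "delimiter" => ";", "footerskip" => 1)
opts = resolve_csv_options(structure; footerskip=0)

opts[:delim]       # ";"  (delimiter takes precedence over delim)
opts[:footerskip]  # 0    (explicit keyword overrides [structure])
```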

Examples

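using IncCSV

# Read with the default sink; the table is a CSV.File.
file = readinc("example.inc")
metadata(file)["title"]
table(file)

using DataFrames

# Read into a DataFrame and forward a CSV.jl keyword option.
file = readinc("example.inc", DataFrame; comment="#")
table(file) isa DataFrame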

IncCSV.readschema - Function
readschema(path)

Read a lightweight metadata schema from an INC-style metadata block.

The schema file reuses INC metadata syntax and normally contains these sections:

---
[MUST]
title = String
columns.score = String

[OPTIONAL]
version = Int

[MUST_NOT]
password = String

[description]
title = Human-readable title
columns.score = Units or meaning of the score column
password = Secrets must not be stored in data files
---

The schema keywords follow the requirement language of IETF RFC 2119: MUST, MUST_NOT, and OPTIONAL. IncCSV writes MUST_NOT with an underscore so that the keyword is a valid INC section name. For reading, REQUIRED and SHALL are accepted as aliases for MUST, SHALL_NOT is accepted as an alias for MUST_NOT, and MAY is accepted as an alias for OPTIONAL.
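
The alias handling amounts to a small lookup from section name to canonical keyword. The sketch below encodes the aliases listed above; canonical_section is a hypothetical name, and exact uppercase matching is an assumption, since the text does not say whether section names are case sensitive.

```julia
# Hypothetical helper: map a schema section name to its canonical keyword.
# Returns `nothing` for sections such as [description] or [schema] that are
# not requirement levels. Assumes exact, uppercase matching.
function canonical_section(name::AbstractString)
    name in ("MUST", "REQUIRED", "SHALL") && return "MUST"
    name in ("MUST_NOT", "SHALL_NOT") && return "MUST_NOT"
    name in ("OPTIONAL", "MAY") && return "OPTIONAL"
    return nothing
end

canonical_section("SHALL_NOT")    # "MUST_NOT"
canonical_section("MAY")          # "OPTIONAL"
canonical_section("description")  # nothing
```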

The optional [schema] section can set schema-level behavior:

[schema]
allow_extra = false

allow_extra defaults to true. When it is false, files containing metadata fields outside [MUST], [MUST_NOT], and [OPTIONAL] fail validation.

Keys in [MUST], [MUST_NOT], [OPTIONAL], and their read-only aliases are metadata field paths. Top-level metadata uses plain names such as title; section entries use one-level dotted paths such as columns.score; a section itself can be described with a path such as columns and a descriptor such as section. Deeper paths such as a.b.c are rejected.
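
The one-level path rule can be sketched as a check on the number of dot-separated components. The helper path_kind and its return symbols are hypothetical, and rejecting empty components (as in "a..b") is an assumption not stated above.

```julia
# Hypothetical helper: classify a schema key as a metadata field path.
function path_kind(path::AbstractString)
    parts = split(path, '.')
    # Assumption: empty components such as "a..b" or ".a" are invalid.
    any(isempty, parts) && throw(ArgumentError("empty path component in \"$path\""))
    length(parts) == 1 && return :top_level_or_section   # e.g. title, columns
    length(parts) == 2 && return :section_entry          # e.g. columns.score
    throw(ArgumentError("path \"$path\" is deeper than one level"))
end

path_kind("title")          # :top_level_or_section
path_kind("columns.score")  # :section_entry
# path_kind("a.b.c") throws an ArgumentError
```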

Values in [MUST], [MUST_NOT], [OPTIONAL], and their read-only aliases are type descriptor strings. They may be more specific than Int or String; IncCSV records them but does not parse strings according to those descriptors.

IncCSV.summarise - Function
summarise(file::IncFile; source=nothing)
summarise(path; csvkwargs...)
summarise(path, sink; csvkwargs...)

Return an IncSummary for an INC file.

The summary is intentionally shallow: it reports the source path when known, the title metadata value when present, row and column counts from the CSV component, metadata field paths, and the line where CSV data starts. The table is still parsed by CSV.jl through readinc, so [structure] metadata and explicit CSV keyword arguments are handled in the usual way.

Example

using IncCSV, DataFrames

summary = summarise("example.inc", DataFrame)
printsummary(summary)

IncCSV.printsummary - Function
printsummary([io], summary::IncSummary)
printsummary([io], file::IncFile; source=nothing)
printsummary([io], path, [sink]; csvkwargs...)

Pretty print a compact summary of an INC file.

When passed a path, printsummary reads the file with readinc and then prints the resulting summary. It returns the summary object, so it can be used both for display and for programmatic checks.

IncCSV.validateschema - Function
validateschema(metadata, schema::IncSchema)
validateschema(file::IncFile, schema::IncSchema)
validateschema(path, schema::IncSchema)

Validate INC metadata against a lightweight schema.

Validation checks that every [MUST] field in the schema is present and that no [MUST_NOT] field is present. [OPTIONAL] fields are described by the schema but need not be present. Metadata fields not described by the schema are returned in SchemaValidation.extra. They are allowed by default, but make validation fail when schema.allow_extra == false.

Type descriptors are not enforced by IncCSV. They are carried by the schema for humans and downstream tools that may want richer parsing.
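
The validation rules can be summarized as set comparisons over field paths. The sketch below works on plain vectors of path strings and returns a named tuple whose field names mirror SchemaValidation; check_paths is a hypothetical name, and how the real implementation collects the paths from a file is not shown.

```julia
# Hypothetical helper: compare the paths present in a file with a schema.
function check_paths(present, must, must_not, optional; allow_extra=true)
    absent    = [p for p in must if !(p in present)]       # required, not found
    forbidden = [p for p in must_not if p in present]      # found, not allowed
    described = union(must, must_not, optional)
    extra     = [p for p in present if !(p in described)]  # outside the schema
    valid = isempty(absent) && isempty(forbidden) && (allow_extra || isempty(extra))
    return (valid=valid, missing=absent, extra=extra, forbidden=forbidden)
end

present = ["title", "columns.score", "author"]
must, must_not, optional = ["title", "columns.score"], ["password"], ["version"]

report = check_paths(present, must, must_not, optional)
report.valid   # true: "author" is extra, but extras are allowed by default
report.extra   # ["author"]

strict = check_paths(present, must, must_not, optional; allow_extra=false)
strict.valid   # false: the same extra field now fails validation
```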

IncCSV.writeinc - Function
writeinc(path, rows; metadata=Dict(), csvkwargs...)

Write an INC file and return path.

metadata is written as a small INI-style block delimited by --- lines. The CSV component is written by CSV.write, so any Tables.jl-compatible input can be used, including a DataFrame.

Metadata values must be Int or String. Metadata sections must be nonempty one-level dictionaries whose values are also Int or String. Keys and section names must follow the same naming rules accepted by the reader. Strings containing literal newlines are rejected because metadata is line oriented.
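
A pre-flight check along these lines can catch unsupported metadata before writing. This is a sketch only: check_metadata is a hypothetical helper that returns a Bool rather than raising an error, and it does not check key or section naming rules, which are not spelled out here.

```julia
# Hypothetical helper mirroring the value constraints listed above.
function check_metadata(meta::AbstractDict)
    # A value is acceptable if it is an Int or a String without newlines.
    ok_value(v) = v isa Int || (v isa AbstractString && !occursin('\n', v))
    for (key, value) in meta
        if value isa AbstractDict
            isempty(value) && return false                 # sections must be nonempty
            all(ok_value, values(value)) || return false   # one level only
        else
            ok_value(value) || return false
        end
    end
    return true
end

good = Dict("title" => "Example data", "version" => 1,
            "columns" => Dict("temperature" => "Celsius"))
bad = Dict("note" => "line one\nline two")

check_metadata(good)  # true
check_metadata(bad)   # false: literal newline in a string value
```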

String values that look like integers are quoted so they roundtrip as strings. String values containing ", \, comment markers, brackets, or = are also quoted and escaped so they roundtrip safely. Integer values are written unquoted and are read back as Int.
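
The quoting rule can be sketched as follows. Two details are assumptions, because they are not stated above: the comment markers are taken to be # and ;, and escaping is taken to mean a backslash placed before each " and \. The helper names needs_quoting and render_value are hypothetical.

```julia
# Hypothetical helpers. Assumptions: comment markers are '#' and ';', and
# escaping places a backslash before each `"` and `\`.
looks_like_integer(s::AbstractString) = tryparse(Int, s) !== nothing

function needs_quoting(s::AbstractString)
    return looks_like_integer(s) || any(c -> c in "\"\\#;[]=", s)
end

function render_value(value::Union{Int,AbstractString})
    value isa Int && return string(value)        # integers are written unquoted
    needs_quoting(value) || return String(value)
    # Escape backslashes first so the escapes added for quotes stay intact.
    escaped = replace(replace(value, "\\" => "\\\\"), "\"" => "\\\"")
    return "\"" * escaped * "\""
end

println(render_value(42))         # 42
println(render_value("42"))       # "42"
println(render_value("Celsius"))  # Celsius
println(render_value("a = b"))    # "a = b"
```

Quoting the string "42" is what lets a reader distinguish it from the integer 42 on the way back in.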

CSV.jl keyword options are forwarded to CSV.write.

Example

using IncCSV

rows = [(time=0, temperature=21), (time=1, temperature=22)]

writeinc(
    "example.inc",
    rows;
    metadata=Dict(
        "title" => "Example data",
        "version" => 1,
        "columns" => Dict("temperature" => "Celsius"),
    ),
)

DataAPI.metadata - Function
metadata(file::IncFile)

Return the metadata dictionary parsed from an INC file.

Top-level metadata keys map to strings or integers. Section metadata maps to a nested dictionary. Unquoted signed integer values are parsed as Int; quoted values and all other values are returned as String. The parsed metadata type is Dict{String,Union{Int,String,Dict{String,Union{Int,String}}}}.
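
The Int/String inference can be approximated in a few lines. This is a sketch under assumptions rather than the package's parser: infer_value is a hypothetical name, quoted values are assumed to use double quotes, escape sequences inside quotes are not processed, and tryparse may be slightly more permissive than a strict signed decimal check.

```julia
# Hypothetical helper: infer the Julia value for one raw metadata value.
function infer_value(raw::AbstractString)
    text = strip(raw)
    if length(text) >= 2 && startswith(text, '"') && endswith(text, '"')
        # Quoted: always a String, even if the contents look like an integer.
        return String(chop(text; head=1, tail=1))
    end
    number = tryparse(Int, text)
    return number === nothing ? String(text) : number
end

infer_value("-3")      # -3, an Int
infer_value("\"-3\"")  # "-3", a String
infer_value("1.5")     # "1.5", a String
```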

Example

using IncCSV

file = readinc("example.inc")
metadata(file)["title"]                   # top-level value
metadata(file)["columns"]["temperature"]  # value inside the [columns] section

IncCSV.table - Function
table(file::IncFile)

Return the CSV table parsed from an INC file.

For readinc(path) this is a CSV.File. For readinc(path, sink), this is the table produced by CSV.read(path, sink; ...), such as a DataFrame.

Example

using IncCSV, DataFrames

file = readinc("example.inc", DataFrame)
table(file)