Managing types

This guide walks you through the process of creating a new data type in the Flowsint ecosystem and integrating it throughout the platform. Types in Flowsint serve as the foundation for all data modeling, providing structure, validation, and schema generation for the entire system.

Understanding the type system

The Flowsint type system is built on Pydantic models and lives in the flowsint-types package. Every type is a Python class that inherits from FlowsintType, which itself inherits from pydantic.BaseModel. This provides automatic validation, serialization, JSON schema generation, and graph-specific functionality such as automatic label generation. The architecture is deliberately simple, with a flat inheritance hierarchy: each type inherits directly from FlowsintType and defines its own fields and behavior.

The package structure is straightforward. Inside flowsint-types/src/flowsint_types/, you'll find individual Python files for each type. Most types get their own file, though closely related types sometimes share one. For example, wallet.py contains CryptoWallet, CryptoWalletTransaction, and CryptoNFT because they work together as a conceptual unit.

Currently, Flowsint includes 39 built-in types covering everything from network entities like domains and IPs to identity information like individuals and organizations, security data like credentials and breaches, and financial information like bank accounts and crypto wallets.

What is FlowsintType?

FlowsintType is the base class for all Flowsint entity types. It extends Pydantic's BaseModel with additional functionality specific to Flowsint's graph database and UI needs:

class FlowsintType(BaseModel):
    """Base class for all Flowsint entity types with label support.
    Label is optional but computed at definition time.
    """
    label: Optional[str] = Field(
        None,
        description="UI-readable label for this entity, the one used on the graph.",
        title="Label"
    )

The label field is automatically set by types using a @model_validator decorator, and this label is what appears on graph nodes in the Neo4j database and in the frontend UI. Every type should compute its own meaningful label based on its fields.

Creating a new type

Let's walk through the process of creating a new type from scratch. We'll use a hypothetical Vehicle type as our example.

Setting up the file

Start by creating a new Python file in the types directory. The filename should be your type name in lowercase snake_case. For a Vehicle type, you would create vehicle.py:

cd flowsint-types/src/flowsint_types/
touch vehicle.py

Basic structure

Every type follows the same structural pattern. Here's what a basic type looks like:

from pydantic import Field, model_validator
from typing import Optional, Self
from .flowsint_base import FlowsintType
 
class Vehicle(FlowsintType):
    """Represents a vehicle with identifying information."""
 
    license_plate: str = Field(
        ...,
        description="Vehicle license plate number",
        title="License Plate",
        json_schema_extra={"primary": True},
    )
    brand: Optional[str] = Field(
        None,
        description="Vehicle manufacturer such as Toyota or Ford",
        title="Make"
    )
    model: Optional[str] = Field(
        None,
        description="Vehicle model name",
        title="Model"
    )
    year: Optional[int] = Field(
        None,
        description="Year of manufacture",
        title="Year"
    )
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute a human-readable label for this vehicle."""
        if self.brand and self.model and self.year:
            self.label = f"{self.license_plate} ({self.brand} {self.model} {self.year})"
        else:
            self.label = self.license_plate
        return self

Let's break down the key components:

Inheritance and imports:

  • The class inherits from FlowsintType
  • Import FlowsintType from .flowsint_base
  • Import model_validator from pydantic and Self from typing (Self requires Python 3.11 or later) for the label computation

Docstring:

  • Every type starts with a clear docstring explaining what it represents

Field definitions:

  • Each field is defined as a class attribute with type hints
  • Use Pydantic's Field() function to provide metadata
  • Required fields use the ellipsis (...) as their default value
  • Optional fields use Optional[Type] in their type hint and None as the default value
  • Always provide description (for API docs) and title (for UI labels)

Primary field:

  • The json_schema_extra={"primary": True} marks the unique identifier for this type
  • This field is used as the key when creating Neo4j nodes
  • Critical: Every type must have exactly one primary field
  • Choose a field that uniquely identifies instances of this type

Label computation:

  • The @model_validator(mode='after') decorator runs after all field validation
  • By convention the method is named compute_label and returns Self (Pydantic itself does not enforce the name)
  • It sets self.label to a human-readable string that will appear in the UI and graph
  • Handle cases where optional fields might be None to avoid ugly labels
  • The label should help users quickly identify what this entity is

Naming conventions

Flowsint follows strict naming conventions to maintain consistency across the codebase. Class names use PascalCase (like Vehicle, SocialAccount, or CryptoWallet). Field names use snake_case (like license_plate, phone_number, or email_address). This matches Python's standard conventions and makes the codebase more readable.

Understanding primary fields and labels

Two concepts are crucial for every Flowsint type: the primary field and the label. Understanding these will help you create types that work seamlessly with the graph database and UI.

The primary field

The primary field is the unique identifier for your type. It's marked with json_schema_extra={"primary": True} in the field definition:

username: str = Field(
    ...,
    description="Username or handle string",
    title="Username",
    json_schema_extra={"primary": True}
)

Why it matters:

  • When creating Neo4j nodes, this field is used as the key in MERGE operations
  • It ensures each entity is uniquely identified in the graph
  • The graph service extracts this field to determine node uniqueness

Rules for primary fields:

  • Every type must have exactly one primary field
  • The primary field should uniquely identify instances
  • It's typically a required field (using ... as default)
  • Common choices: IDs, usernames, emails, license plates, domain names

Examples of good primary fields:

  • Domain: domain field (e.g., "example.com")
  • Email: email field (e.g., "user@example.com")
  • Username: value field (e.g., "john_doe")
  • Ip: address field (e.g., "192.168.1.1")
  • SocialAccount: id field (computed as "username@platform")
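
To make the "exactly one primary field" rule concrete, here is a minimal sketch of how the flag can be read back from a model class. It assumes Pydantic v2. The FlowsintType and Domain classes are reduced local stand-ins, and primary_field_names is a hypothetical helper written for this illustration, not part of flowsint-types or of the actual graph service.

```python
from typing import List, Optional, Type

from pydantic import BaseModel, Field


class FlowsintType(BaseModel):
    """Local stand-in for the real base class, reduced to the label field."""

    label: Optional[str] = Field(None, title="Label")


class Domain(FlowsintType):
    """Simplified stand-in type with one primary and one optional field."""

    domain: str = Field(..., title="Domain", json_schema_extra={"primary": True})
    registrar: Optional[str] = Field(None, title="Registrar")


def primary_field_names(model_cls: Type[BaseModel]) -> List[str]:
    """Return the names of all fields flagged with {"primary": True}."""
    names = []
    for name, info in model_cls.model_fields.items():
        extra = info.json_schema_extra
        # json_schema_extra may be None, a dict, or a callable; only a dict
        # can carry the "primary" flag used in this guide.
        if isinstance(extra, dict) and extra.get("primary") is True:
            names.append(name)
    return names


print(primary_field_names(Domain))  # ['domain']
```

A check such as len(primary_field_names(SomeType)) == 1 could then be used in a test to catch a type that has no primary field or more than one.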

The label field and compute_label

The label is what users see in the UI and on graph nodes. It should be human-readable and help users quickly understand what an entity represents.

How it works:

  1. FlowsintType provides a label field (Optional[str])
  2. Your type defines a compute_label method to set this field
  3. The method runs automatically after validation using @model_validator(mode='after')

Basic pattern:

from pydantic import model_validator
from typing import Self
 
@model_validator(mode='after')
def compute_label(self) -> Self:
    """Compute a human-readable label."""
    self.label = f"@{self.value}"
    return self

Advanced patterns:

When you have optional fields, handle None values gracefully:

@model_validator(mode='after')
def compute_label(self) -> Self:
    """Compute label with optional display name."""
    if self.display_name:
        self.label = f"{self.display_name} (@{self.username.value})"
    else:
        self.label = f"@{self.username.value}"
    return self

For types with multiple identifiers, you might compute a composite ID:

@model_validator(mode='after')
def compute_label_and_id(self) -> Self:
    """Compute both ID and label."""
    # Compute unique ID from username and platform
    if self.username and self.platform:
        self.id = f"{self.username.value}@{self.platform}"
    elif self.username:
        self.id = self.username.value
 
    # Compute display label
    if self.display_name:
        self.label = f"{self.display_name} (@{self.username.value})"
    else:
        self.label = f"@{self.username.value}"
    return self

Best practices for labels:

  • Keep labels concise but informative
  • Include the most identifying information first
  • Handle None values for optional fields
  • Use parentheses or separators to structure complex labels
  • Think about what users need to see at a glance on the graph

Real-world examples

# Simple: just the value
# Username: "@john_doe"
self.label = f"@{self.value}"
 
# With context: show platform if available
# Username: "@john_doe (twitter)"
if self.platform:
    self.label = f"@{self.value} ({self.platform})"
else:
    self.label = f"@{self.value}"
 
# Rich: combine multiple fields
# Individual: "John Doe (john@example.com)"
if self.email:
    self.label = f"{self.full_name} ({self.email})"
else:
    self.label = self.full_name
 
# Complex: show key information
# Breach: "LinkedIn (2021) - 700,000,000 records"
self.label = f"{self.title} ({self.breachdate.split('-')[0]}) - {self.pwncount:,} records"

Working with different field types

Pydantic supports a wide range of field types beyond simple strings and integers. Here are the most common ones you'll use:

from pydantic import Field, HttpUrl, model_validator
from typing import Optional, List, Dict, Any, Self
from datetime import datetime
from .flowsint_base import FlowsintType
 
class ExampleType(FlowsintType):
    """Demonstrates various field types."""
 
    # Primary identifier
    id: str = Field(
        ...,
        description="Unique identifier",
        title="ID",
        json_schema_extra={"primary": True}
    )
 
    # Primitive types
    text_field: str = Field(..., description="A text string", title="Text")
    number_field: int = Field(..., description="An integer number", title="Number")
    decimal_field: float = Field(..., description="A decimal number", title="Decimal")
    boolean_field: bool = Field(..., description="True or false value", title="Boolean")
 
    # Optional fields
    optional_text: Optional[str] = Field(None, description="Optional text", title="Optional Text")
 
    # Collections - note the use of default_factory
    tags: List[str] = Field(
        default_factory=list,
        description="List of tag strings",
        title="Tags"
    )
 
    metadata: Dict[str, Any] = Field(
        default_factory=dict,
        description="Arbitrary metadata dictionary",
        title="Metadata"
    )
 
    # Special Pydantic types
    website: HttpUrl = Field(..., description="A validated URL", title="Website")
    timestamp: datetime = Field(..., description="Date and time", title="Timestamp")
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this example."""
        self.label = f"{self.id} - {self.text_field}"
        return self

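When working with mutable types like lists and dictionaries, use default_factory instead of providing a default value directly. Writing default_factory=list states explicitly that every instance gets its own fresh list. Pydantic does copy a mutable default such as default=[] for each instance, so it avoids the shared-object bug that the same pattern causes in plain Python classes and function defaults, but default_factory is the convention used throughout Flowsint and does not rely on that copying behavior.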
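
A quick way to see the effect of default_factory is to create two instances and mutate one of them. This is a minimal sketch assuming Pydantic v2; Note is a hypothetical model used only for illustration.

```python
from typing import Any, Dict, List

from pydantic import BaseModel, Field


class Note(BaseModel):
    """Hypothetical model with two mutable collection fields."""

    tags: List[str] = Field(default_factory=list)
    metadata: Dict[str, Any] = Field(default_factory=dict)


first = Note()
second = Note()

# Mutating one instance leaves the other untouched, because the factory
# is called once per instance.
first.tags.append("osint")
first.metadata["source"] = "manual"

print(first.tags)       # ['osint']
print(second.tags)      # []
print(second.metadata)  # {}
```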

Adding validation

Sometimes you need more sophisticated validation than just type checking. Pydantic lets you add custom validators using the field_validator decorator:

from pydantic import Field, field_validator
from typing import Optional, Any, Self
import ipaddress
from .flowsint_base import FlowsintType
 
class Ip(FlowsintType):
    """Represents an IP address with geolocation and ISP information."""
 
    address: str = Field(
        ...,
        description="IP address",
        title="IP Address",
        json_schema_extra={"primary": True},
    )
    ...
    @field_validator("address")
    @classmethod
    def validate_ip_address(cls, v: str) -> str:
        """Validate that the address is a valid IP address."""
        try:
            ipaddress.ip_address(v)
            return v
        except ValueError:
            raise ValueError(f"Invalid IP address: {v}")

Validators receive the field value and can either return a (potentially modified) value or raise a ValueError with an error message. This example checks that the address parses as a valid IPv4 or IPv6 address and returns it unchanged. Note that @field_validator runs before @model_validator(mode='after'), so the address is validated before the label is computed.
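
The check inside validate_ip_address relies only on the standard library, so it can be tried on its own. The sketch below wraps it in a small function (is_valid_ip is a name chosen for this illustration) to show which inputs ipaddress.ip_address accepts.

```python
import ipaddress


def is_valid_ip(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


# IPv4 and IPv6 literals are accepted; out-of-range octets, hostnames,
# and CIDR notation are rejected.
for candidate in ["192.168.1.1", "2001:db8::1", "256.0.0.1", "example.com", "10.0.0.0/8"]:
    print(f"{candidate!r}: {is_valid_ip(candidate)}")
```

Because CIDR ranges are rejected here, network blocks need their own type and validator rather than reusing this one.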

Referencing other types

Types often need to reference other Flowsint types. You can import and use them just like any other Python type:

from pydantic import Field, model_validator
from typing import Optional, Self
from .flowsint_base import FlowsintType
from .email import Email
from .phone import Phone
 
class Contact(FlowsintType):
    """Represents contact information for a person."""
 
    name: str = Field(
        ...,
        description="Contact name",
        title="Name",
        json_schema_extra={"primary": True}
    )
    email: Optional[Email] = Field(None, description="Email address", title="Email")
    phone: Optional[Phone] = Field(None, description="Phone number", title="Phone")
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this contact."""
        self.label = self.name
        return self

For types with circular references or complex relationships, you may need to call model_rebuild() at the end of your file:

from pydantic import Field, model_validator
from typing import Optional, Self
from .flowsint_base import FlowsintType
 
class CryptoWallet(FlowsintType):
    """Represents a cryptocurrency wallet."""
 
    address: str = Field(
        ...,
        description="Wallet address",
        title="Address",
        json_schema_extra={"primary": True}
    )
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this wallet."""
        self.label = self.address
        return self
 
class CryptoWalletTransaction(FlowsintType):
    """Represents a transaction between wallets."""
 
    transaction_id: str = Field(
        ...,
        description="Unique transaction ID",
        title="Transaction ID",
        json_schema_extra={"primary": True}
    )
    source: CryptoWallet = Field(..., description="Source wallet", title="Source")
    target: Optional[CryptoWallet] = Field(None, description="Target wallet", title="Target")
    amount: float = Field(..., description="Transaction amount", title="Amount")
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this transaction."""
        self.label = f"{self.amount} ({self.transaction_id[:8]}...)"
        return self
 
# Rebuild models to resolve forward references
CryptoWallet.model_rebuild()
CryptoWalletTransaction.model_rebuild()
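
The wallet example above defines CryptoWallet before it is used, so it contains no forward references. The sketch below shows a case where model_rebuild() does matter: a field annotated with the name of a class that is defined later in the file. It assumes Pydantic v2, and Wallet and Transfer are simplified hypothetical models that inherit from BaseModel only to keep the snippet self-contained.

```python
from typing import List

from pydantic import BaseModel, Field


class Wallet(BaseModel):
    address: str
    # "Transfer" is written as a string because the class does not exist yet.
    transfers: List["Transfer"] = Field(default_factory=list)


class Transfer(BaseModel):
    transfer_id: str
    amount: float


# Resolve the "Transfer" forward reference now that both classes exist.
Wallet.model_rebuild()

wallet = Wallet(
    address="0xabc",
    transfers=[{"transfer_id": "t1", "amount": 1.5}],
)
print(type(wallet.transfers[0]).__name__)  # Transfer
```

Pydantic v2 typically attempts this rebuild on first use as well. Calling model_rebuild() explicitly at the end of the module makes an unresolved name fail at import time instead of at the first instantiation.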

Exporting your type

Once you've created your type, you need to export it from the package so other parts of Flowsint can use it.

Updating the package exports

Open flowsint-types/src/flowsint_types/__init__.py and add two things. First, import your new type at the top of the file with the other imports:

from .address import Location
from .affiliation import Affiliation
from .alias import Alias
# ... other imports ...
from .vehicle import Vehicle  # Add your import here

Second, add your type name to the __all__ list:

__all__ = [
    "Location",
    "Affiliation",
    "Alias",
    # ... other types ...
    "Vehicle",  # Add your type here
]

The __all__ list explicitly defines what gets exported when someone does from flowsint_types import *. While wildcard imports aren't always recommended, this ensures your type is properly exposed by the package.

Installing the package

After making these changes, you need to reinstall the package for them to take effect:

make prod
# or
cd flowsint-types
poetry install

This updates the package in your development environment so enrichers and the API can import your new type.

Integrating with the API

The final step is making your type available through the API so frontends can discover it and create instances.

Adding to the types route

Open flowsint-api/app/api/routes/types.py and import your new type with the others:

from flowsint_types import (
    Domain, Ip, Port, Email,
    # ... other imports ...
    Vehicle,  # Add your type here
)

Categorizing your type

The API organizes types into logical categories that appear in the frontend. In the get_types_list() function, you'll find a list of category dictionaries. You need to add your type to an appropriate category or create a new one.

Here's how you add to an existing category:

@router.get("/")
async def get_types_list(current_user: User = Depends(get_current_user)):
    types = [
        {
            "id": uuid4(),
            "type": "physical_assets",
            "key": "physical_assets_category",
            "label": "Physical Assets",
            "children": [
                extract_input_schema(Device, label_key="name"),
                extract_input_schema(Vehicle, label_key="license_plate"),
            ],
        },
        # ... other categories ...
    ]
    return types

The extract_input_schema() function takes your Pydantic model and converts it into a JSON schema that the frontend can use. The label_key parameter tells it which field to use as the primary label when displaying instances of this type.

If you're creating a new category, follow the same structure. Each category needs a unique type and key, a human-readable label, and a list of children containing the type schemas.
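
As a rough sketch of that structure, the helper below builds a category entry with the same keys as the example above. make_category and the "transport" category are hypothetical, and the "<type>_category" key pattern is only inferred from the single example shown. In the real route the children would come from extract_input_schema(...) calls.

```python
from typing import Any, Dict, List
from uuid import uuid4


def make_category(type_id: str, label: str, children: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a category entry shaped like the ones in get_types_list()."""
    return {
        "id": uuid4(),                   # fresh identifier per response
        "type": type_id,                 # unique machine-readable name
        "key": f"{type_id}_category",    # assumed naming pattern
        "label": label,                  # human-readable name shown in the UI
        "children": children,            # one schema dict per type
    }


# An empty children list keeps this sketch runnable without the API package.
transport = make_category("transport", "Transport", children=[])
print(sorted(transport))  # ['children', 'id', 'key', 'label', 'type']
```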

Available categories

Flowsint currently organizes types into these standard categories:

Global contains general-purpose types like Location and Phrase that don't fit neatly into other categories.

Identities & Entities includes Individual, Username, and Organization for representing people and groups.

Communication & Contact covers Phone, Email, SocialAccount, and Message for communication-related data.

Network encompasses all network-related types including ASN, CIDR, Domain, Website, Ip, Port, DNSRecord, SSLCertificate, and WebTracker.

Security & Access groups security-relevant types like Credential, Session, Device, Malware, and Weapon.

Files & Documents contains Document and File for representing digital files.

Financial Data includes BankAccount and CreditCard for financial information.

Leaks covers data breach information with Leak and Breach types.

Crypto contains cryptocurrency-related types including CryptoWallet, CryptoWalletTransaction, and CryptoNFT.

You can add your type to any of these categories or create a new category if none fit.

Complete examples

Here are some complete examples that illustrate different patterns.

Simple type example

The simplest types have just one or two required fields and minimal complexity:

from pydantic import Field, model_validator
from typing import Self
from .flowsint_base import FlowsintType
 
class Hashtag(FlowsintType):
    """Represents a social media hashtag."""
 
    tag: str = Field(
        ...,
        description="Hashtag text without the # symbol",
        title="Hashtag",
        json_schema_extra={"primary": True}
    )
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this hashtag."""
        self.label = f"#{self.tag}"
        return self

Type with validation

This example shows a Social Security Number type with format validation:

from pydantic import Field, field_validator, model_validator
from typing import Self
from .flowsint_base import FlowsintType
import re
 
class SocialSecurityNumber(FlowsintType):
    """Represents a US Social Security Number."""
 
    ssn: str = Field(
        ...,
        description="Social Security Number in format XXX-XX-XXXX",
        title="SSN",
        json_schema_extra={"primary": True}
    )
 
    @field_validator('ssn')
    @classmethod
    def validate_ssn_format(cls, v: str) -> str:
        """Validate SSN format and normalize to standard format."""
        clean = v.replace("-", "").replace(" ", "")
 
        if not re.match(r"^\d{9}$", clean):
            raise ValueError(
                "SSN must be exactly 9 digits (format: XXX-XX-XXXX or XXXXXXXXX)"
            )
 
        return f"{clean[:3]}-{clean[3:5]}-{clean[5:]}"
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this SSN."""
        # Mask most digits for privacy
        self.label = f"SSN ***-**-{self.ssn[-4:]}"
        return self
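
The normalization step in validate_ssn_format is easy to try outside Pydantic. The sketch below copies that logic into a standalone function (normalize_ssn is a name chosen for this illustration) and runs it on an obviously fake number.

```python
import re


def normalize_ssn(raw: str) -> str:
    """Strip separators, require nine digits, and return XXX-XX-XXXX."""
    clean = raw.replace("-", "").replace(" ", "")
    if not re.match(r"^\d{9}$", clean):
        raise ValueError("SSN must be exactly 9 digits")
    return f"{clean[:3]}-{clean[3:5]}-{clean[5:]}"


print(normalize_ssn("123456789"))    # 123-45-6789
print(normalize_ssn("123 45 6789"))  # 123-45-6789

# The label keeps only the last four digits of the normalized value.
print(f"SSN ***-**-{normalize_ssn('123-45-6789')[-4:]}")  # SSN ***-**-6789

try:
    normalize_ssn("12345")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Since the field validator returns the normalized string, the label validator can rely on the last four characters always being digits.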

Type with relationships

This example shows how types can reference other types to build rich data models:

from pydantic import Field, model_validator
from typing import Optional, Self
from .flowsint_base import FlowsintType
from .email import Email
 
class Whois(FlowsintType):
    """Represents WHOIS domain registration information."""
 
    domain: str = Field(
        ...,
        description="Domain name",
        title="Domain",
        json_schema_extra={"primary": True}
    )
 
    registrar: Optional[str] = Field(
        None,
        description="Name of the domain registrar",
        title="Registrar"
    )
 
    email: Optional[Email] = Field(
        None,
        description="Contact email address from WHOIS record",
        title="Contact Email"
    )
 
    creation_date: Optional[str] = Field(
        None,
        description="Date when the domain was first registered",
        title="Creation Date"
    )
 
    expiration_date: Optional[str] = Field(
        None,
        description="Date when the domain registration expires",
        title="Expiration Date"
    )
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this WHOIS record."""
        if self.registrar:
            self.label = f"{self.domain} (via {self.registrar})"
        else:
            self.label = f"WHOIS: {self.domain}"
        return self

Complex type with collections

This example demonstrates a type with lists of other types and rich metadata:

from pydantic import Field, model_validator
from typing import Optional, List, Dict, Any, Self
from .flowsint_base import FlowsintType
from .individual import Individual
from .address import Location
 
class Organization(FlowsintType):
    """Represents an organization with comprehensive business information."""
 
    name: str = Field(
        ...,
        description="Legal name of the organization",
        title="Organization Name",
        json_schema_extra={"primary": True}
    )
 
    registration_number: Optional[str] = Field(
        None,
        description="Official business registration number",
        title="Registration Number"
    )
 
    headquarters: Optional[Location] = Field(
        None,
        description="Primary headquarters location",
        title="Headquarters"
    )
 
    executives: List[Individual] = Field(
        default_factory=list,
        description="List of company executives and board members",
        title="Executives"
    )
 
    locations: List[Location] = Field(
        default_factory=list,
        description="All office and facility locations",
        title="Locations"
    )
 
    employee_count: Optional[int] = Field(
        None,
        description="Total number of employees",
        title="Employee Count"
    )
 
    revenue: Optional[float] = Field(
        None,
        description="Annual revenue in USD",
        title="Revenue"
    )
 
    industry: Optional[str] = Field(
        None,
        description="Primary industry sector",
        title="Industry"
    )
 
    metadata: Dict[str, Any] = Field(
        default_factory=dict,
        description="Additional metadata and custom fields",
        title="Metadata"
    )
 
    @model_validator(mode='after')
    def compute_label(self) -> Self:
        """Compute label for this organization."""
        if self.industry:
            self.label = f"{self.name} ({self.industry})"
        else:
            self.label = self.name
        return self

Best practices and common patterns

Documentation

Keep documentation at the forefront. Every type should have:

  • A clear docstring explaining what it represents
  • A descriptive description parameter for each field (for API docs)
  • A meaningful title parameter for each field (for UI labels)

Future developers (including yourself) will thank you for this clarity.

Required vs optional fields

Think carefully about what should be required versus optional:

  • Required fields (using ...): Only fields that uniquely identify an entity or are absolutely essential
  • Optional fields (using Optional[Type] and None): Most other fields should be optional since intelligence gathering is incremental and you rarely have complete information upfront

Always inherit from FlowsintType

Never inherit directly from Pydantic's BaseModel. Always use FlowsintType:

# ✅ Correct
from .flowsint_base import FlowsintType
 
class MyType(FlowsintType):
    ...
 
# ❌ Wrong
from pydantic import BaseModel
 
class MyType(BaseModel):
    ...

Always define a primary field

Every type must have exactly one primary field marked with json_schema_extra={"primary": True}:

id: str = Field(
    ...,
    description="Unique identifier",
    title="ID",
    json_schema_extra={"primary": True}  # ✅ Always include this
)

Choose a field that uniquely identifies instances of your type (IDs, usernames, emails, domain names, etc.).

Always implement compute_label

Every type must implement a compute_label method to set the label displayed in the UI and graph:

@model_validator(mode='after')
def compute_label(self) -> Self:
    """Compute a human-readable label."""
    # Handle None values gracefully
    if self.optional_field:
        self.label = f"{self.primary_field} ({self.optional_field})"
    else:
        self.label = self.primary_field
    return self

Best practices for labels:

  • Keep them concise but informative
  • Handle None values for optional fields gracefully
  • Put the most important information first
  • Think about what users need to see at a glance on the graph

Type hints and validation

Use type hints everywhere. They provide:

  • Automatic validation
  • Better IDE support and autocomplete
  • Inline documentation
  • Runtime type checking via Pydantic

For mutable default values like lists and dictionaries, always use default_factory:

# ✅ Correct
tags: List[str] = Field(default_factory=list)
metadata: Dict[str, Any] = Field(default_factory=dict)
 
# ❌ Avoid - Pydantic copies these per instance, but default_factory is the convention
tags: List[str] = Field(default=[])
metadata: Dict[str, Any] = Field(default={})

Importing other types

When referencing other Flowsint types, use relative imports to avoid circular import issues:

# ✅ Correct
from .email import Email
from .phone import Phone
 
# ❌ Avoid
from flowsint_types import Email, Phone  # Can cause circular imports

If you encounter circular import problems, you can use forward references (strings) in type hints and call model_rebuild() at the end of your module.

Custom validation

Consider adding custom validators for complex validation logic that goes beyond simple type checking:

@field_validator('email')
@classmethod
def validate_email(cls, v: str) -> str:
    """Validate and normalize email format."""
    if not is_valid_email(v):
        raise ValueError("Invalid email format")
    return v.lower()

This keeps validation logic close to the type definition and ensures data integrity throughout the system.

Order of execution

Remember the order in which Pydantic processes your type:

  1. Field validators (@field_validator) run first, validating and potentially transforming individual fields
  2. Model validators declared with mode='after' run next, operating on the fully validated model (mode='before' model validators are the exception: they see the raw input before any field validation)
  3. Your compute_label method is one of these mode='after' validators, so it runs only once every field has been validated

This means you can safely access validated field values in compute_label.
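
The ordering can be observed directly by recording each call. This is a minimal sketch assuming Pydantic v2; Mailbox is a hypothetical model, and the calls list exists only to make the order visible.

```python
from typing import List, Optional

from pydantic import BaseModel, field_validator, model_validator

calls: List[str] = []  # records the order in which validators run


class Mailbox(BaseModel):
    """Hypothetical model with one field validator and one model validator."""

    address: str
    label: Optional[str] = None

    @field_validator("address")
    @classmethod
    def normalize_address(cls, v: str) -> str:
        calls.append("field_validator")
        return v.lower()

    @model_validator(mode="after")
    def compute_label(self) -> "Mailbox":
        calls.append("model_validator")
        # self.address already holds the normalized, lowercase value here.
        self.label = self.address
        return self


mailbox = Mailbox(address="User@Example.com")
print(calls)          # ['field_validator', 'model_validator']
print(mailbox.label)  # user@example.com
```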

Testing your type

Writing tests for your types ensures they work correctly and helps catch bugs early. Create a test file in flowsint-types/tests/ that matches your type filename.

Basic test structure

# flowsint-types/tests/test_vehicle.py
from flowsint_types import Vehicle
import pytest
 
def test_vehicle_creation():
    """Test creating a vehicle with required fields."""
    vehicle = Vehicle(license_plate="ABC123")
    assert vehicle.license_plate == "ABC123"
 
def test_vehicle_with_optional_fields():
    """Test creating a vehicle with optional fields."""
    vehicle = Vehicle(
        license_plate="ABC123",
        brand="Toyota",
        model="Camry",
        year=2020
    )
    assert vehicle.brand == "Toyota"
    assert vehicle.year == 2020
 
def test_vehicle_missing_required_field():
    """Test that validation fails without required fields."""
    with pytest.raises(ValueError):
        Vehicle()  # Should fail - missing required field

Testing the primary field

Always verify that the primary field is correctly set:

def test_vehicle_primary_field():
    """Test that the primary field is correctly identified."""
    vehicle = Vehicle(license_plate="ABC123")
 
    # Check that license_plate is marked as primary
    field_info = Vehicle.model_fields["license_plate"]
    assert field_info.json_schema_extra.get("primary") is True

Testing label computation

The label is crucial for UI display, so test it thoroughly:

def test_vehicle_label_basic():
    """Test label computation with only required fields."""
    vehicle = Vehicle(license_plate="ABC123")
    assert vehicle.label == "ABC123"
 
def test_vehicle_label_with_details():
    """Test label computation with optional fields."""
    vehicle = Vehicle(
        license_plate="ABC123",
        brand="Toyota",
        model="Camry",
        year=2020
    )
    assert vehicle.label == "ABC123 (Toyota Camry 2020)"
 
def test_vehicle_label_partial_details():
    """Test label computation with some optional fields."""
    vehicle = Vehicle(
        license_plate="ABC123",
        brand="Toyota"
    )
    # Should handle None values gracefully
    assert vehicle.label == "ABC123"

Testing field validators

If your type has custom validators, test both valid and invalid inputs:

# tests/test_username.py
from flowsint_types import Username
import pytest
 
def test_username_valid():
    """Test valid username creation."""
    username = Username(value="john_doe")
    assert username.value == "john_doe"
    assert username.label == "john_doe"
 
def test_username_validation_too_short():
    """Test that usernames under 3 characters are rejected."""
    with pytest.raises(ValueError, match="Must be 3-80 characters"):
        Username(value="ab")
 
def test_username_validation_invalid_chars():
    """Test that invalid characters are rejected."""
    with pytest.raises(ValueError, match="only letters, numbers, underscores, and hyphens"):
        Username(value="john@doe")
 
def test_username_validation_boundaries():
    """Test boundary conditions."""
    # Minimum length
    username = Username(value="abc")
    assert username.value == "abc"
 
    # Maximum length
    long_name = "a" * 80
    username = Username(value=long_name)
    assert username.value == long_name
 
    # Too long
    with pytest.raises(ValueError):
        Username(value="a" * 81)

Testing types with nested objects

When your type contains other Flowsint types, test the relationships:

# tests/test_social_account.py
from flowsint_types import SocialAccount, Username
import pytest
 
def test_social_account_creation():
    """Test creating a social account with a username object."""
    username = Username(value="john_doe")
    account = SocialAccount(
        username=username,
        platform="twitter",
        profile_url="https://twitter.com/john_doe"
    )
 
    assert account.username.value == "john_doe"
    assert account.platform == "twitter"
    assert account.id == "john_doe@twitter"
 
def test_social_account_label_with_display_name():
    """Test label computation with display name."""
    username = Username(value="john_doe")
    account = SocialAccount(
        username=username,
        platform="twitter",
        display_name="John Doe"
    )
 
    assert account.label == "John Doe (@john_doe)"
 
def test_social_account_label_without_display_name():
    """Test label computation without display name."""
    username = Username(value="john_doe")
    account = SocialAccount(
        username=username,
        platform="twitter"
    )
 
    assert account.label == "@john_doe"

Testing serialization

Verify that your types serialize correctly to JSON:

from flowsint_types import Vehicle
 
def test_vehicle_serialization():
    """Test that vehicle serializes to JSON correctly."""
    vehicle = Vehicle(
        license_plate="ABC123",
        brand="Toyota",
        model="Camry",
        year=2020
    )
 
    # Convert to dict
    data = vehicle.model_dump()
    assert data["license_plate"] == "ABC123"
    assert data["brand"] == "Toyota"
    assert data["label"] == "ABC123 (Toyota Camry 2020)"
 
    # Convert to JSON string
    json_str = vehicle.model_dump_json()
    assert "ABC123" in json_str
 
def test_vehicle_deserialization():
    """Test creating vehicle from dictionary."""
    data = {
        "license_plate": "ABC123",
        "brand": "Toyota",
        "model": "Camry",
        "year": 2020
    }
 
    vehicle = Vehicle(**data)
    assert vehicle.license_plate == "ABC123"
    assert vehicle.label == "ABC123 (Toyota Camry 2020)"
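
A full round trip through JSON is a slightly stronger check than searching the output for a substring. The sketch below assumes Pydantic v2 and uses a hypothetical stand-in model, VehicleSketch, rather than the real Vehicle type, so its fields are an assumption. model_dump_json and model_validate_json are standard Pydantic methods.

```python
from typing import Optional

from pydantic import BaseModel


class VehicleSketch(BaseModel):
    """Hypothetical stand-in for Vehicle; not the real Flowsint type."""

    license_plate: str
    brand: Optional[str] = None
    model: Optional[str] = None
    year: Optional[int] = None


def json_round_trip(vehicle: VehicleSketch) -> VehicleSketch:
    """Serialize to a JSON string, then parse it back into a new instance."""
    return VehicleSketch.model_validate_json(vehicle.model_dump_json())


original = VehicleSketch(
    license_plate="ABC123", brand="Toyota", model="Camry", year=2020
)
restored = json_round_trip(original)
```

If restored compares equal to original, no field was lost or changed type on the way through JSON.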

Running the tests

To run your tests:

cd flowsint-types
poetry run pytest tests/test_vehicle.py -v
 
# Run all tests
poetry run pytest -v
 
# Run with coverage (requires the pytest-cov plugin)
poetry run pytest --cov=flowsint_types tests/

Best practices for testing

  • Test the happy path first: Basic creation with valid data
  • Test validation: Both valid and invalid inputs
  • Test edge cases: Empty strings, very long strings, boundary values
  • Test label computation: With and without optional fields
  • Test the primary field: Ensure it's correctly marked
  • Test serialization: To/from dict and JSON
  • Use descriptive test names: The test name should describe what it tests
  • Use pytest fixtures for complex setup that's reused across tests

Example with fixtures:

import pytest
from flowsint_types import Username, SocialAccount
 
@pytest.fixture
def sample_username():
    """Fixture providing a sample username."""
    return Username(value="john_doe")
 
@pytest.fixture
def sample_account(sample_username):
    """Fixture providing a sample social account."""
    return SocialAccount(
        username=sample_username,
        platform="twitter",
        profile_url="https://twitter.com/john_doe"
    )
 
def test_with_fixtures(sample_account):
    """Test using fixtures."""
    assert sample_account.username.value == "john_doe"
    assert sample_account.platform == "twitter"
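
Boundary and edge-case checks often repeat the same assertion with different inputs, and pytest.mark.parametrize can collapse them into one test. This is a minimal sketch: UsernameSketch is a hypothetical stand-in for Username, and its rule (3 to 80 letters, digits, underscores, or hyphens) is assumed from the tests above rather than copied from the real validator.

```python
import re

import pytest
from pydantic import BaseModel, field_validator


class UsernameSketch(BaseModel):
    """Hypothetical stand-in for Username; the real validator may differ."""

    value: str

    @field_validator("value")
    @classmethod
    def check_value(cls, v: str) -> str:
        # Assumed rule: 3-80 letters, digits, underscores, or hyphens.
        if not re.fullmatch(r"[A-Za-z0-9_-]{3,80}", v):
            raise ValueError("Must be 3-80 characters")
        return v


@pytest.mark.parametrize(
    "value, is_valid",
    [
        ("ab", False),        # one below the minimum
        ("abc", True),        # minimum length
        ("a" * 80, True),     # maximum length
        ("a" * 81, False),    # one above the maximum
        ("john@doe", False),  # disallowed character
    ],
)
def test_username_boundaries(value, is_valid):
    """Each tuple above is reported by pytest as its own test case."""
    if is_valid:
        assert UsernameSketch(value=value).value == value
    else:
        with pytest.raises(ValueError):
            UsernameSketch(value=value)
```

When one case fails, pytest names the failing input in its report, which makes boundary regressions easier to spot than a single test with several assertions.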

Troubleshooting common issues

Import errors

If you encounter import errors after creating your type, make sure you've run poetry install in the flowsint-types directory. The package needs to be reinstalled for changes to take effect:

cd flowsint-types
poetry install

Type not appearing in the API

If your type doesn't appear in the API, verify that you've:

  1. Imported it in flowsint_types/__init__.py
  2. Added it to the __all__ list in flowsint_types/__init__.py
  3. Imported it in flowsint-api/app/api/routes/types.py
  4. Added it to the appropriate category in the get_types_list() function
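
The first two checks in this list can be automated. The helper below is a hypothetical sketch, not part of Flowsint: missing_exports is an invented name, and the package is simulated with an in-memory module so the example runs without flowsint_types installed.

```python
import types


def missing_exports(module, names):
    """Return the names that are absent from the module or from its __all__."""
    exported = set(getattr(module, "__all__", []))
    return [n for n in names if n not in exported or not hasattr(module, n)]


# Stand-in for the real package, built in memory so the sketch runs anywhere.
fake_package = types.ModuleType("flowsint_types_sketch")
fake_package.Username = type("Username", (), {})
fake_package.Vehicle = type("Vehicle", (), {})  # imported...
fake_package.__all__ = ["Username"]             # ...but not listed in __all__

print(missing_exports(fake_package, ["Username", "Vehicle"]))  # ['Vehicle']
```

Against the real package, the same helper could be called as missing_exports(flowsint_types, ["Vehicle"]) after import flowsint_types.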

Validation errors

For validation errors, check how each field is declared:

  • Required fields use the ellipsis (...) as the default in Field(...)
  • Optional fields use None as the default
  • Optional fields are also annotated as Optional[Type] in the type hint
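
As an illustration of these declarations, the sketch below defines a hypothetical model (FieldDeclarationSketch, not a real Flowsint type) with one required and one optional field, plus a small invented helper, error_locations, that reports which fields Pydantic rejects. It assumes Pydantic v2.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class FieldDeclarationSketch(BaseModel):
    """Hypothetical model used only to illustrate the declarations."""

    # Required: the ellipsis means "no default", so callers must supply a value.
    license_plate: str = Field(..., description="Plate number", title="License plate")
    # Optional: Optional[...] in the hint AND None as the default.
    brand: Optional[str] = Field(None, description="Manufacturer", title="Brand")


def error_locations(payload: dict) -> list:
    """Return the field paths Pydantic complains about for a payload."""
    try:
        FieldDeclarationSketch(**payload)
    except ValidationError as exc:
        return [err["loc"] for err in exc.errors()]
    return []
```

Calling error_locations({}) reports only license_plate, since brand may be omitted or set to None.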

Nodes not appearing in the graph

If your type's instances aren't appearing in Neo4j:

  • Missing primary field: Ensure exactly one field is marked with json_schema_extra={"primary": True}
  • Primary field not accessible: If the primary field is a nested object, create a computed string field as the primary instead
  • Check the enricher: Verify that enrichers using this type call self.create_node(instance)
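
To confirm the first point quickly, you can inspect a model's fields for the primary marker. This is a sketch under assumptions: primary_fields is a hypothetical helper, the two models are stand-ins, and the only API relied on is Pydantic v2's model_fields mapping and the json_schema_extra attribute of each field.

```python
from typing import List, Optional, Type

from pydantic import BaseModel, Field


def primary_fields(model_cls: Type[BaseModel]) -> List[str]:
    """List field names whose json_schema_extra marks them as primary."""
    names = []
    for name, info in model_cls.model_fields.items():
        extra = info.json_schema_extra
        # json_schema_extra may be None or a callable; only dicts carry the flag.
        if isinstance(extra, dict) and extra.get("primary") is True:
            names.append(name)
    return names


class GoodSketch(BaseModel):
    """Hypothetical type with exactly one primary field."""

    plate: str = Field(..., json_schema_extra={"primary": True})
    brand: Optional[str] = None


class NoPrimarySketch(BaseModel):
    """Hypothetical type that forgot to mark a primary field."""

    plate: str
```

A result with exactly one name is what the checklist asks for; an empty list or several names points to the problem.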

Label not appearing correctly

If labels aren't displaying correctly in the UI or graph:

  • Missing compute_label: Ensure your type implements a compute_label method decorated with @model_validator(mode='after')
  • Not returning self: An 'after' validator must return self, otherwise the validated instance is lost
  • None handling: Check that the label is built only from optional fields that are actually set
  • Method name: The method must be named compute_label exactly
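
The points above can be sketched as follows. DeviceSketch is a hypothetical type that inherits directly from pydantic.BaseModel and declares its own label field so the example is standalone; a real type would inherit from FlowsintType instead.

```python
from typing import Optional

from pydantic import BaseModel, Field, model_validator


class DeviceSketch(BaseModel):
    """Hypothetical type; uses BaseModel so the sketch runs on its own."""

    serial: str = Field(..., json_schema_extra={"primary": True})
    vendor: Optional[str] = None
    label: Optional[str] = None

    @model_validator(mode="after")
    def compute_label(self):
        # Append the vendor only when it is set, so None never leaks into the label.
        self.label = f"{self.serial} ({self.vendor})" if self.vendor else self.serial
        return self  # an "after" validator must return the instance
```

With only the serial set, the label is just the serial; with a vendor, it becomes "serial (vendor)".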

Circular imports

If you're seeing issues with circular imports:

  • Use relative imports (from .email import Email) instead of absolute imports
  • Use forward references (string type hints) if needed
  • Call model_rebuild() at the end of your module to resolve forward references
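
A minimal sketch of the last two points, kept in a single module for brevity: OrgSketch and PersonSketch are hypothetical models that refer to each other, with the not-yet-defined class written as a string hint. It assumes Pydantic v2, where model_rebuild() is the method that resolves pending forward references.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class OrgSketch(BaseModel):
    """Hypothetical model that references a class defined further down."""

    name: str
    # "PersonSketch" does not exist yet, so the hint is written as a string.
    members: List["PersonSketch"] = Field(default_factory=list)


class PersonSketch(BaseModel):
    """Hypothetical model that points back at OrgSketch."""

    name: str
    employer: Optional[OrgSketch] = None


# Both classes now exist, so the string hint can be resolved.
OrgSketch.model_rebuild()

org = OrgSketch(name="Acme", members=[PersonSketch(name="Ada")])
ada = PersonSketch(name="Ada", employer=OrgSketch(name="Acme"))
```

When the two classes live in separate files, the same pattern applies, with the model_rebuild() call placed after both modules have been imported.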

Enricher errors with your type

If enrichers fail when using your type:

  • Validation failures: Your field validators might be too strict; check validator error messages in logs
  • Nested object issues: When passing nested Flowsint types, pass the complete object rather than recreating it
  • Primary key extraction: The graph service needs to extract a primitive value from your primary field

Next steps

Once you've created and registered your type, you can use it in enrichers to build intelligence gathering workflows. Types serve as the input and output specifications for enrichers, and they define the structure of nodes in the Neo4j graph database.

Key checklist for new types

Before considering your type complete, verify that you've:

  • Inherited from FlowsintType
  • Marked exactly one field as primary with json_schema_extra={"primary": True}
  • Implemented a compute_label method that handles None values gracefully
  • Provided description and title for all fields
  • Used default_factory for list and dict fields
  • Written tests for creation, validation, primary field, and label computation
  • Exported your type in flowsint_types/__init__.py
  • Added it to the API routes in flowsint-api/app/api/routes/types.py
  • Run poetry install to make the type available
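
As a short illustration of the default_factory item in this checklist, the sketch below uses a hypothetical model, TagsSketch, with a list field and a dict field. The field names are invented for the example; Field(default_factory=...) is standard Pydantic.

```python
from typing import Dict, List

from pydantic import BaseModel, Field


class TagsSketch(BaseModel):
    """Hypothetical model showing defaults for collection fields."""

    name: str
    # default_factory builds a fresh list or dict for every new instance.
    tags: List[str] = Field(
        default_factory=list, description="Free-form tags", title="Tags"
    )
    attributes: Dict[str, str] = Field(
        default_factory=dict, description="Extra attributes", title="Attributes"
    )


first = TagsSketch(name="a")
second = TagsSketch(name="b")
first.tags.append("seen")  # only the first instance's list changes
```

Each instance gets its own empty collection, so appending to first.tags leaves second.tags empty.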

Exploring further

You might also want to explore:

  • Creating enrichers: Use your type as input/output in custom enrichers
  • Custom types via API: Flowsint supports runtime type creation using JSON Schema (see flowsint-core/src/flowsint_core/core/models.py)
  • Graph relationships: Learn how types are connected in the Neo4j graph database
  • Type schemas: Understand how Pydantic schemas are used for API validation

Final thoughts

Remember that types are the foundation of everything in Flowsint:

  • Well-designed types make enrichers easier to write
  • Clear primary fields ensure proper node identification in the graph
  • Meaningful labels make the UI and graph database more intuitive
  • Thorough validation ensures data integrity throughout the platform

With these concepts mastered, you're ready to create powerful, robust types that will make the entire Flowsint platform more effective for intelligence gathering.

Need help troubleshooting, or spotted a bug? Feel free to submit an issue.