Building an Attack Surface Recon Script with Criminal IP’s AITEM API

I did not start from a blank “how to call an API” template. I started from a public report published on June 14, 2026 that says Criminal IP introduced AITEM at Infosecurity Europe 2026 and positioned it as the next step in attack surface management. That is enough context to make one useful point: the hard part is not getting JSON back from a vendor. The hard part is turning that data into an inventory you can defend, with scope, confidence, and change tracking baked in from the start.

That is the position I take in this walkthrough: an attack surface recon script should be opinionated about scope before it gets clever about enrichment. If you reverse that order, you get a very fast noise generator.

Why I started with this report instead of a blank API tutorial

Most API tutorials begin with authentication and a sample response. For attack surface management, that is the wrong first step. The first question is not “can I query the service?” It is “what am I allowed to query, and how will I prove the results belong to my environment?”

The news item about AITEM gives a narrow but useful anchor. It confirms that Criminal IP used Infosecurity Europe 2026 to introduce AITEM as an attack surface management product. It does not give a documented schema, endpoint list, or rate-limit policy. I am not going to pretend it does.

So instead of guessing at undocumented routes, I am using the report as a trigger for a defensive pattern:

define the scope in machine-readable form
query only approved assets
store raw results separately from normalized inventory
score findings with confidence, not just severity
diff every run against the previous one

That pattern is vendor-agnostic. It is also the only way I trust an attack surface API in production.

What the Infosecurity Europe 2026 report actually confirms about AITEM

Event context and timeline

The confirmed facts from the public report are straightforward:

the article was published on June 14, 2026
it says Criminal IP introduced AITEM at Infosecurity Europe 2026
it describes AITEM as part of attack surface management

That is enough to establish the timing and the vendor’s positioning.

What it does not confirm matters just as much:

no public request format in the source
no public endpoint contract in the source
no public sample payload in the source
no claim in the source that the API should be used for broad internet scanning

That last point matters. Attack surface management and general-purpose recon sound similar, but they are not the same thing. A defensive ASM workflow should be anchored to owned domains, owned IP ranges, and approved enrichment. Anything broader starts drifting toward intelligence collection instead of inventory.

Confirmed facts versus inference

Item	Status	Why it matters
Criminal IP introduced AITEM at Infosecurity Europe 2026	Confirmed	Establishes the product context and date
The product is positioned around attack surface management	Confirmed	Tells us the intended defensive use case
AITEM exposes an API suitable for scripting	Inference	Plausible, but not proven by the source alone
The API can return host, port, or exposure data	Inference	Likely for ASM, but not stated in the source
The script should be scope-first and inventory-first	My recommendation	This is the safest operational model
The API should be treated as authoritative without normalization	False assumption	Real ASM data is often stale, duplicated, or partial

My technical opinion is blunt: if the data model does not preserve confidence and provenance, you do not have an attack surface management pipeline. You have a feed.

Set a defensive scope before you write a single request

Owned domains, IP ranges, and approval boundaries

Before I touch the API, I want a scope file that answers three questions:

What do we own?
Who approved the run?
What do we explicitly exclude?

A good scope file is boring, and that is the point.

# scope.yaml
approvedBy: secops
ticket: IR-1842
roots:
  domains:
    - example.com
    - example.net
  cidrs:
    - 203.0.113.0/24
    - 198.51.100.0/24
exclude:
  domains:
    - status.example.com
  cidrs:
    - 203.0.113.200/32
notes: "Production assets only; no third-party hosted demos."

That file becomes the first input to the script, not a comment in a README. I would also keep it in version control, because every scope change should be reviewable.
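
To make the scope file enforceable rather than decorative, here is a minimal sketch of a scope matcher. It assumes scope.yaml has already been parsed into a plain object (the `scope` literal stands in for that), handles IPv4 only, and treats an excluded domain as also excluding its subdomains. The function names are my own, not part of any vendor API.

```javascript
// Hypothetical scope object mirroring scope.yaml after parsing.
const scope = {
  roots: {
    domains: ["example.com", "example.net"],
    cidrs: ["203.0.113.0/24", "198.51.100.0/24"],
  },
  exclude: {
    domains: ["status.example.com"],
    cidrs: ["203.0.113.200/32"],
  },
};

// Convert a dotted-quad IPv4 address to an integer; null if malformed.
function ipv4ToInt(ip) {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when ip falls inside cidr (IPv4 only in this sketch).
function ipInCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) return false;
  const size = 2 ** (32 - bits); // number of addresses in the block
  return Math.floor(ipInt / size) === Math.floor(baseInt / size);
}

// True when host equals domain or is a subdomain of it.
function hostUnder(host, domain) {
  return host === domain || host.endsWith("." + domain);
}

// Exclusions are checked first so they always win over roots.
function hostInScope(host, scope) {
  const h = host.trim().toLowerCase().replace(/\.$/, "");
  if (scope.exclude.domains.some((d) => hostUnder(h, d))) return false;
  return scope.roots.domains.some((d) => hostUnder(h, d));
}

function ipInScope(ip, scope) {
  if (scope.exclude.cidrs.some((c) => ipInCidr(ip, c))) return false;
  return scope.roots.cidrs.some((c) => ipInCidr(ip, c));
}
```

Checking exclusions before roots is a deliberate choice: an asset that matches both lists is dropped, which is the safer default when approval boundaries are ambiguous.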

What data to collect and what to leave out

Attack surface recon gets messy when it starts hoarding everything it can see. I prefer a narrow inventory that is actually useful for triage.

Collect	Leave out
Hostname	Passwords, tokens, API keys
IP address	Private response bodies
Port and protocol	Session cookies
TLS issuer and subject	Personal data
Banner or service fingerprint	Full HTML pages unless explicitly approved
First seen / last seen	Destructive payloads
Source and confidence	Anything outside approval boundaries

The mistake is thinking “more data” means “better recon.” In practice, more data usually means more false positives, more storage, and a higher chance you cross a boundary you should not.

A practical architecture for attack surface recon

Discovery, enrichment, and prioritization as separate stages

I split the script into three stages:

Discovery: pull candidate assets from the API.
Enrichment: attach metadata that helps a human understand the asset.
Prioritization: rank what deserves review first.

That split keeps the system honest. Discovery can be noisy. Enrichment can be expensive. Prioritization can be subjective. If you mix them together, you cannot tell which part introduced the problem.

A clean pipeline looks like this:

approved scope -> discovery -> raw results -> normalization -> enrichment -> scoring -> review queue

Each arrow should be a real file or table, not just a function call. That makes retries, diffing, and auditing much easier.

Where the script should store state

I would not keep state in a single JSON blob once the script grows past a proof of concept. Use SQLite or another small embedded database instead.

A minimal schema is enough:

runs: one row per execution
assets: canonical inventory of hostnames, IPs, services, URLs
findings: derived exposures and risk notes
raw_pages: optional cache of vendor responses for debugging

Why SQLite? Because you want cheap joins, easy diffs, and a durable history of what changed between runs. A flat file is fine for a demo. It becomes fragile the first time you need to answer “when did this port appear?” or “was this result already stale last week?”
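
As a rough illustration of that schema, the DDL below is kept as a string constant so it can be handed to whichever SQLite binding the script ends up using. Every column name here is my own assumption; only the four table names come from the list above.

```javascript
// Hypothetical DDL for the four tables; column names are illustrative assumptions.
const SCHEMA_SQL = `
CREATE TABLE IF NOT EXISTS runs (
  run_id      TEXT PRIMARY KEY,
  started_at  TEXT NOT NULL,
  finished_at TEXT,
  partial     INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS assets (
  asset_key   TEXT PRIMARY KEY,
  asset_type  TEXT NOT NULL,
  first_seen  TEXT NOT NULL,
  last_seen   TEXT NOT NULL,
  confidence  TEXT NOT NULL,
  status      TEXT NOT NULL DEFAULT 'active'
);
CREATE TABLE IF NOT EXISTS findings (
  finding_id  INTEGER PRIMARY KEY,
  asset_key   TEXT NOT NULL REFERENCES assets(asset_key),
  run_id      TEXT NOT NULL REFERENCES runs(run_id),
  signal      TEXT NOT NULL,
  confidence  TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS raw_pages (
  run_id      TEXT NOT NULL REFERENCES runs(run_id),
  page_no     INTEGER NOT NULL,
  body        TEXT NOT NULL,
  PRIMARY KEY (run_id, page_no)
);
`;

// Example lookup for "when did this port appear?" under the schema above,
// where the parameter is a service key such as service:203.0.113.10:443/tcp.
const FIRST_SEEN_SQL = `
SELECT asset_key, first_seen FROM assets WHERE asset_key = ?;
`;
```

Storing timestamps as ISO 8601 text keeps them sortable and readable in an ad hoc query, which is usually what matters when someone is debugging a diff.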

Why a single pass is not enough

A single scan is a snapshot, and snapshots lie by omission.

Attack surfaces change for ordinary reasons:

autoscaling brings hosts online and offline
certificates rotate
CDN frontends mask origin changes
temporary staging hosts get forgotten
vendor feeds lag behind reality

That means one pass can tell you what the API saw once. It cannot tell you what is persistent, what is stale, or what is newly risky. You need at least two runs before you trust a trend.

Building the JavaScript API client

Auth, environment variables, and header handling

I like a tiny client with explicit configuration and no hidden magic. Keep secrets in environment variables, redact them in logs, and make the base URL configurable so you can point the same code at a mock server during tests.

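// aitem-client.mjs
// Minimal JSON client: per-request timeout, bounded retries, Retry-After handling.
import { setTimeout as sleep } from "node:timers/promises";

const MAX_ATTEMPTS = 5;

export function createClient({
  baseUrl,
  apiKey,
  timeoutMs = 10000,
  userAgent = "aitem-recon/1.0",
}) {
  if (!baseUrl) throw new Error("Missing baseUrl");
  if (!apiKey) throw new Error("Missing apiKey");

  async function requestJson(path, { method = "GET", body, cursor } = {}) {
    const url = new URL(path, baseUrl);
    if (cursor) url.searchParams.set("cursor", cursor);

    let lastError;
    for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
      const controller = new AbortController();
      const timer = setTimeout(() => controller.abort(), timeoutMs);

      try {
        const res = await fetch(url, {
          method,
          headers: {
            Authorization: `Bearer ${apiKey}`,
            "Content-Type": "application/json",
            "User-Agent": userAgent,
          },
          body: body ? JSON.stringify(body) : undefined,
          signal: controller.signal,
        });

        if (res.status === 429 || res.status >= 500) {
          // Retryable: honor Retry-After when it is given in seconds,
          // otherwise fall back to linear backoff.
          lastError = new Error(`HTTP ${res.status}`);
          const retryAfter = Number(res.headers.get("retry-after"));
          const waitMs =
            Number.isFinite(retryAfter) && retryAfter > 0
              ? Math.min(retryAfter * 1000, 30000)
              : 500 * attempt;
          if (attempt < MAX_ATTEMPTS) await sleep(waitMs);
          continue;
        }

        if (!res.ok) {
          // Any other 4xx is a caller error; retrying will not help.
          const text = await res.text().catch(() => "");
          const err = new Error(`HTTP ${res.status}: ${text.slice(0, 200)}`);
          err.retryable = false;
          throw err;
        }

        return await res.json();
      } catch (err) {
        // Network errors and timeouts are retried; non-retryable errors propagate.
        if (err.retryable === false || attempt === MAX_ATTEMPTS) throw err;
        lastError = err;
        await sleep(500 * attempt);
      } finally {
        clearTimeout(timer);
      }
    }
    // Every attempt ended in 429 or 5xx: fail loudly instead of returning undefined.
    throw lastError ?? new Error("Request failed after retries");
  }

  return { requestJson };
}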

The important part is not the syntax. It is the behavior:

time out hung requests
respect 429 and Retry-After
retry with backoff
avoid leaking the API key in error output

Pagination, retries, and rate limiting

Attack surface feeds are usually paginated. Treat pagination as a state machine, not as a loop that assumes every page exists.

A typical pattern is:

fetch page one
save the cursor
write each record immediately
continue until the cursor disappears
mark the run partial if any page fails after retries

That last step matters. If page seven fails and you silently exit, your inventory will look complete and be wrong.
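
The steps above can be sketched as a small loop that reports whether it finished. This is only an illustration: `fetchPage` is a hypothetical callback that resolves to `{ items, nextCursor }`, and the real response shape would depend on whatever the vendor documents.

```javascript
// fetchPage(cursor) is a hypothetical callback resolving to { items, nextCursor }.
// onRecord(item) persists one record before the next page is requested.
async function collectPages(fetchPage, onRecord, maxPages = 1000) {
  let cursor = null;
  let pages = 0;
  let records = 0;

  while (pages < maxPages) {
    let page;
    try {
      page = await fetchPage(cursor);
    } catch (err) {
      // The page failed even after the client's own retries: stop and flag the run.
      return { partial: true, pages, records, error: err.message };
    }

    pages += 1;
    for (const item of page.items ?? []) {
      await onRecord(item);
      records += 1;
    }

    if (!page.nextCursor) {
      return { partial: false, pages, records };
    }
    cursor = page.nextCursor;
  }

  // Hitting the page cap means the feed never terminated; treat that as incomplete.
  return { partial: true, pages, records, error: "page limit reached" };
}
```

The page cap is a guard against a cursor that never ends. Returning a result object instead of throwing lets the caller keep the records already written and still mark the run partial.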

A safe retry policy is usually enough for a script:

retry on network errors
retry on 429
retry on 5xx
do not retry on 4xx other than 429
stop after a small number of attempts and mark the run partial

Safe error handling and exit codes

I use exit codes to distinguish bad configuration from upstream failure:

0 = success
2 = invalid configuration or scope
3 = partial data collected
4 = upstream API failure after retries

That lets automation react correctly. A CI job that sees 3 should alert a human, not assume the whole system is broken. A job that sees 2 should fail fast and tell you the config needs fixing.

Also, keep stderr output short. In a security tool, noisy error dumps are a liability if they include URLs, headers, or raw payloads.
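
A minimal sketch of how those codes and the short-stderr rule might fit together follows. `ConfigError` is a hypothetical class the scope loader could throw, and the redaction patterns are a starting point rather than a complete list.

```javascript
const EXIT = { OK: 0, BAD_CONFIG: 2, PARTIAL: 3, UPSTREAM_FAILURE: 4 };

// Hypothetical error class for invalid configuration or scope.
class ConfigError extends Error {}

// Map a run outcome to an exit code. `result` is assumed to look like
// { partial: boolean }; `error` is whatever the run threw, if anything.
function exitCodeFor({ result, error }) {
  if (error instanceof ConfigError) return EXIT.BAD_CONFIG;
  if (error) return EXIT.UPSTREAM_FAILURE;
  if (result && result.partial) return EXIT.PARTIAL;
  return EXIT.OK;
}

// One short line for stderr: strip URLs and bearer tokens, then cap the length.
function safeMessage(error) {
  return String(error && error.message ? error.message : error)
    .replace(/https?:\/\/\S+/g, "[url]")
    .replace(/Bearer\s+\S+/gi, "Bearer [redacted]")
    .slice(0, 120);
}
```

In the entry point this would typically end with `console.error(safeMessage(err))` followed by `process.exit(exitCodeFor({ result, error: err }))`.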

Normalizing assets into one inventory

Canonical keys for hosts, IPs, services, and URLs

Raw recon data is often duplicated in slightly different forms:

EXAMPLE.com
example.com.
https://example.com:443
example.com:443
203.0.113.10

Normalize them before you compare anything.

I like canonical keys such as:

host:example.com
ip:203.0.113.10
service:203.0.113.10:443/tcp
url:https://example.com/

That gives each asset type one stable identity. It also makes dedupe logic much simpler.

A small helper goes a long way:

// Lowercase, trim, and drop the trailing root dot so that
// "EXAMPLE.com." and "example.com" compare equal.
export function canonicalHost(host) {
  return host.trim().toLowerCase().replace(/\.$/, "");
}

// The URL parser already lowercases the hostname and drops default ports
// (443 for https, 80 for http). This helper also strips a trailing dot
// on the hostname and removes the fragment.
export function canonicalUrl(raw) {
  const url = new URL(raw);
  url.hostname = canonicalHost(url.hostname);
  url.hash = "";
  return url.toString();
}
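
To connect those helpers to the canonical key format, here is a small sketch of key builders. The function names are mine, and the key prefixes simply follow the examples listed earlier.

```javascript
// Build canonical keys in the host:/ip:/service:/url: format.
function hostKey(host) {
  return "host:" + host.trim().toLowerCase().replace(/\.$/, "");
}

function ipKey(ip) {
  return "ip:" + ip.trim();
}

function serviceKey(ip, port, protocol = "tcp") {
  return `service:${ip.trim()}:${Number(port)}/${protocol.toLowerCase()}`;
}

function urlKey(raw) {
  const url = new URL(raw); // lowercases the host and drops default ports
  url.hash = "";
  return "url:" + url.toString();
}

// Deduping becomes a Set operation once every record has a key.
const keys = new Set([
  hostKey("EXAMPLE.com"),
  hostKey("example.com."),
  urlKey("https://example.com:443"),
  urlKey("https://EXAMPLE.com/"),
]);
// keys now holds two entries: host:example.com and url:https://example.com/
```

One caveat: a bare `example.com:443` has no scheme, so the URL parser reads `example.com` as the scheme instead of the host. Inputs in that form are better split into host and port and routed through `hostKey` or `serviceKey`.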

Deduping overlapping and stale records

Do not overwrite records blindly. Merge them.

I keep these fields for each asset:

first_seen
last_seen
sources
confidence
status

If the same asset appears from multiple pages or multiple runs, merge it and preserve provenance. If it disappears for one run, do not delete it immediately. Mark it as missing only after a threshold, such as two or three consecutive misses.

That policy saves you from false deletions when the API is stale or partially degraded.
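
A minimal sketch of that merge policy, assuming the inventory is held as a `Map` keyed by canonical asset key. The `misses` counter, the threshold of three, and the choice to take the latest confidence value are all my assumptions.

```javascript
const MISS_THRESHOLD = 3; // assumed value; two would also fit the policy above

// inventory: Map of asset key -> record.
// observed: array of { key, source, confidence } from the current run.
// partial: when true, absence is not counted, since the run saw incomplete data.
function mergeRun(inventory, observed, runTime, { partial = false } = {}) {
  const seenKeys = new Set();

  for (const obs of observed) {
    seenKeys.add(obs.key);
    const existing = inventory.get(obs.key);
    if (existing) {
      // Merge instead of overwrite: keep first_seen and accumulate sources.
      existing.last_seen = runTime;
      existing.confidence = obs.confidence;
      existing.status = "active";
      existing.misses = 0;
      if (!existing.sources.includes(obs.source)) existing.sources.push(obs.source);
    } else {
      inventory.set(obs.key, {
        key: obs.key,
        first_seen: runTime,
        last_seen: runTime,
        sources: [obs.source],
        confidence: obs.confidence,
        status: "active",
        misses: 0,
      });
    }
  }

  if (!partial) {
    for (const [key, record] of inventory) {
      if (seenKeys.has(key)) continue;
      record.misses += 1;
      if (record.misses >= MISS_THRESHOLD) record.status = "missing";
    }
  }
  return inventory;
}
```

Skipping the miss count on partial runs is what keeps a degraded API from slowly marking real assets as missing.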

Discovery pass: turning API results into an asset list

Query patterns that keep the scope tight

I would query by approved roots only:

exact domains from the scope file
exact CIDRs from the scope file
explicit subdomains only when the provider supports scoped expansion
no wildcards that cross organizational boundaries

The point is to make every result traceable to a root you approved.

If the API supports search by keyword or certificate fingerprint, I would still constrain the query to owned domains. Broad search terms are how you end up collecting other people’s infrastructure.

Filtering out noise before enrichment

Not every result is worth enriching.

I usually filter out:

known parking domains
CDN hostnames that are clearly shared and not owned
wildcard DNS noise
management endpoints that belong to a third party
duplicate assets with no new metadata

That filter does not need to be perfect. It needs to reduce obvious waste before you spend extra requests on enrichment.

A practical rule: if the discovery result does not match your approved scope and it does not have enough provenance to explain why it is adjacent, drop it.

Enrichment pass: making raw exposure data usable

Ports, banners, fingerprints, and service metadata

This is where the API data becomes useful to a human.

The most helpful enrichment fields are usually:

open port
protocol
TLS subject and issuer
banner string
service family
last observed timestamp
source confidence

Here is the key distinction I keep visible:

confirmed: the API saw port 443 open, or it returned a concrete certificate subject
inferred: a banner looks like nginx, or a fingerprint suggests a framework version

Those are not the same thing. A banner-derived version string can be wrong. A port that was observed open is much harder to dispute.

Optional external enrichment and how to isolate it

If I add DNS, ASN, certificate transparency, or whois lookups, I isolate them behind a feature flag and a separate worker.

Why? Because external enrichment can accidentally cross the line from “verify my own asset” into “collect more about adjacent infrastructure.”

A safer pattern is:

keep external enrichment off by default
restrict it to approved roots
cache results locally
tag every field with source and timestamp
time out aggressively

This is where a lot of teams overreach. I would rather have a smaller verified inventory than a giant enrichment graph full of guesses.
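
One way to sketch that isolation is a wrapper that refuses to run unless it is explicitly enabled and the key is in scope. Here `lookup` is a hypothetical async resolver injected by the caller (DNS, ASN, certificate transparency, or whois), `inScope` is whatever scope predicate the script already has, and the caller owns the cache.

```javascript
// lookup(key): hypothetical async resolver injected by the caller.
// inScope(key): predicate that returns true only for approved assets.
// cache: a Map owned by the caller so results persist across calls.
async function enrichExternal(
  key,
  { enabled = false, inScope, lookup, cache, timeoutMs = 2000, source = "external" }
) {
  if (!enabled) return null;        // off by default
  if (!inScope(key)) return null;   // approved roots only
  if (cache.has(key)) return cache.get(key);

  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("enrichment timeout")), timeoutMs);
  });

  try {
    const value = await Promise.race([lookup(key), timeout]);
    // Every field carries its source and the time it was fetched.
    const field = { value, source, fetched_at: new Date().toISOString() };
    cache.set(key, field);
    return field;
  } catch {
    // A failed or slow lookup leaves the asset un-enriched instead of guessed.
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

Returning `null` on failure is intentional. A missing enrichment field is easier to reason about than a stale or partial one.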

Risk scoring that does not exaggerate

A simple rubric for exposure, criticality, and confidence

I do not want a score that looks precise but hides the assumptions. I want a score that is simple enough to explain in a ticket.

A usable rubric might look like this:

Factor	Range	Example
Exposure	0-5	Internet-facing admin port gets 5
Asset criticality	0-5	Production auth service gets 5
Confidence	0-3	Confirmed observation gets 3
Urgency modifier	0-2	Active change or repeated sightings get 2

Then compute a priority value from those inputs. The formula matters less than the discipline of separating observation from judgment.
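
For illustration, here is one possible formula under the rubric's ranges. The specific arithmetic (confidence scales the observed severity, urgency is added on top) is my assumption, not a standard. The point is that the inputs travel with the score so a ticket can explain it.

```javascript
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Ranges follow the rubric: exposure 0-5, criticality 0-5,
// confidence 0-3, urgency 0-2.
function priorityScore({ exposure, criticality, confidence, urgency = 0 }) {
  const e = clamp(exposure, 0, 5);
  const c = clamp(criticality, 0, 5);
  const k = clamp(confidence, 0, 3);
  const u = clamp(urgency, 0, 2);

  // Assumed formula: confidence scales severity, urgency is additive.
  const raw = (e + c) * (k / 3) + u;

  return {
    score: Math.round(raw * 10) / 10,
    inputs: { exposure: e, criticality: c, confidence: k, urgency: u },
  };
}

// A confirmed, internet-facing admin port on a production auth service:
// priorityScore({ exposure: 5, criticality: 5, confidence: 3, urgency: 2 }).score is 12
// The same asset seen only through a weak fingerprint:
// priorityScore({ exposure: 5, criticality: 5, confidence: 1 }).score is 3.3
```

Multiplying by confidence means a low-confidence observation cannot reach the top of the queue on severity alone, which matches the goal of not exaggerating.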

Separating confirmed risk from inferred risk

I keep two labels for every finding:

confirmed_risk
inferred_risk

Examples:

Confirmed risk: an internet-facing admin port on an approved production host
Inferred risk: a banner suggests an outdated stack
Confirmed risk: a TLS certificate subject does not match the expected hostname
Inferred risk: the service “probably” belongs to a forgotten staging environment

You should never let inferred data outrank confirmed exposure without a human review step.
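
That rule can be enforced in the sort order itself. In this sketch each finding carries a `risk` label, a numeric `score`, and an optional `reviewed` flag. The flag is a hypothetical field that a human sets to promote an inferred finding.

```javascript
// Confirmed findings (or inferred ones a human has reviewed) always sort
// ahead of unreviewed inferred findings; score only breaks ties within a tier.
function rankFindings(findings) {
  const tier = (f) => (f.risk === "confirmed_risk" || f.reviewed === true ? 0 : 1);
  return [...findings].sort((a, b) => tier(a) - tier(b) || b.score - a.score);
}
```

Sorting by tier first means no scoring formula, however it is tuned later, can push a guess above an observation.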

Ranking which findings deserve human review first

If I only have time for a few items, I start with:

exposed admin interfaces
production hosts that appeared outside the allowlist
high-value services with new ports
assets that disappeared and reappeared
low-confidence fingerprints

That order is deliberate. It favors evidence that is both actionable and likely to represent real exposure.

Reproducible commands and sample output

Example CLI invocation

A simple CLI can make the pipeline repeatable:

AITEM_BASE_URL="https://api.example.invalid" \
AITEM_API_KEY="redacted-in-shell-history" \
node recon.mjs \
  --scope scope.yaml \
  --out inventory.json \
  --csv inventory.csv \
  --dry-run

For a real test, I would run the script against a mock response first. That lets you validate pagination, retry logic, and normalization without depending on the vendor.

Example JSON output and what to look for

This is the shape of output I want from a dry run:

{
  "run_id": "2026-06-14T18:30:00Z",
  "scope": {
    "roots": ["example.com", "example.net"]
  },
  "summary": {
    "discovered": 42,
    "in_scope": 31,
    "enriched": 24,
    "high_priority": 3,
    "partial": false
  },
  "assets": [
    {
      "key": "host:api.example.com",
      "type": "host",
      "source": "aitem",
      "confidence": "confirmed",
      "first_seen": "2026-06-14T18:30:00Z",
      "last_seen": "2026-06-14T18:30:00Z",
      "signals": [
        { "name": "port", "value": 443, "confidence": "confirmed" },
        { "name": "tls_issuer", "value": "Example Issuing CA", "confidence": "confirmed" },
        { "name": "service_fingerprint", "value": "nginx", "confidence": "inferred" }
      ]
    }
  ]
}

What I look for first:

does every asset have a canonical key?
is scope preserved in the output?
are inferred fields labeled as inferred?
does the summary show whether the run was partial?

If those answers are messy, the pipeline is not ready for analysts.
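
Those four questions are easy to automate. The sketch below checks a dry-run output object of the shape shown above and returns a list of problems; the function name and the message strings are my own.

```javascript
// Returns an array of human-readable problems; an empty array means the
// output passes the four checks.
function checkInventory(output) {
  const problems = [];
  const keyPattern = /^(host|ip|service|url):.+/;

  if (!output.scope || !Array.isArray(output.scope.roots) || output.scope.roots.length === 0) {
    problems.push("scope.roots is missing or empty");
  }
  if (!output.summary || typeof output.summary.partial !== "boolean") {
    problems.push("summary.partial is missing");
  }

  for (const asset of output.assets ?? []) {
    if (!keyPattern.test(asset.key ?? "")) {
      problems.push(`asset has no canonical key: ${JSON.stringify(asset.key)}`);
    }
    for (const signal of asset.signals ?? []) {
      if (signal.confidence !== "confirmed" && signal.confidence !== "inferred") {
        problems.push(`signal ${signal.name} on ${asset.key} has no confidence label`);
      }
    }
  }
  return problems;
}
```

Running this as the last step of a dry run, and failing the run when the list is non-empty, keeps malformed records away from analysts.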

Example CSV export for analysts

CSV is still useful because analysts like spreadsheets.

asset_key,asset_type,signal,severity,confidence,first_seen,last_seen,source
host:api.example.com,host,open-443,high,confirmed,2026-06-14T18:30:00Z,2026-06-14T18:30:00Z,aitem
host:api.example.com,host,tls-mismatch,medium,confirmed,2026-06-14T18:30:00Z,2026-06-14T18:30:00Z,aitem
host:staging.example.net,host,unknown-banner,low,inferred,2026-06-14T18:30:00Z,2026-06-14T18:30:00Z,aitem

The analyst view should be flatter than the internal inventory. It needs enough context to triage quickly, not every raw field the API returned.
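
A minimal export sketch using the column order from the sample above. Because banner strings come from hosts you do not control, this version also prefixes cells that start with `=`, `+`, `-`, or `@` with a single quote so a spreadsheet does not evaluate them as formulas. That guard is my addition, not something the sample output requires.

```javascript
const COLUMNS = [
  "asset_key", "asset_type", "signal", "severity",
  "confidence", "first_seen", "last_seen", "source",
];

// Quote cells containing commas, quotes, or newlines; neutralize cells
// that a spreadsheet could interpret as a formula.
function csvCell(value) {
  let text = value === null || value === undefined ? "" : String(value);
  if (/^[=+\-@]/.test(text)) text = "'" + text;
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// rows: flat objects keyed by the column names above.
function toCsv(rows) {
  const lines = [COLUMNS.join(",")];
  for (const row of rows) {
    lines.push(COLUMNS.map((column) => csvCell(row[column])).join(","));
  }
  return lines.join("\n") + "\n";
}
```

Fixing the column list in one constant keeps the export stable even when the internal record grows new fields.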

Failure modes, false positives, and stale intelligence

Missing coverage and delayed updates

Attack surface feeds are not the truth. They are one observer’s view of your environment.

Common failure modes include:

a host exists but is not indexed yet
a host was removed but still appears in the feed
a service changed ports between collection cycles
a certificate rotated and the fingerprint changed
a shared platform made your asset look like someone else’s

That is why last-seen timestamps and repeated runs matter. A single high-risk record can be stale and still look urgent.

Partial API failures and backoff strategy

If the API rate-limits you or drops pages, do not paper over it.

What I do instead:

back off on 429
respect Retry-After
cache successful pages immediately
mark the run partial when pagination is incomplete
keep the previous good inventory until the next full run succeeds

That gives you a stable baseline and makes outages visible rather than silent.

When to distrust a high-risk score

I distrust a high score when:

the fingerprint is low confidence
the asset sits on shared cloud or CDN infrastructure
the only signal is a banner string
the data is old enough that the last_seen date is questionable
the asset has not appeared in at least two independent discovery passes

A high score is a triage hint, not proof.

Operationalizing the script after the first run

Scheduled scans and diff-based alerts

Once the first run works, automate the diff.

I would schedule it daily or weekly depending on churn, then alert only on meaningful changes:

new in-scope host
new exposed port
certificate mismatch
unexpected removal of a production asset
confidence drop on a previously confirmed service

This is where the tool becomes genuinely useful. The value is not in one giant report. It is in the delta.
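
A minimal sketch of that diff, assuming each run has been reduced to a `Map` from canonical asset key to `{ ports, confidence }`. The change type names are illustrative, and certificate checks are left out to keep it short.

```javascript
// prev and curr: Map of asset key -> { ports: number[], confidence: string }
function diffRuns(prev, curr) {
  const changes = [];

  for (const [key, now] of curr) {
    const before = prev.get(key);
    if (!before) {
      changes.push({ type: "new_asset", key });
      continue;
    }
    const oldPorts = new Set(before.ports);
    for (const port of now.ports) {
      if (!oldPorts.has(port)) changes.push({ type: "new_port", key, port });
    }
    if (before.confidence === "confirmed" && now.confidence !== "confirmed") {
      changes.push({ type: "confidence_drop", key });
    }
  }

  for (const key of prev.keys()) {
    if (!curr.has(key)) changes.push({ type: "removed_asset", key });
  }
  return changes;
}
```

An empty result is a useful signal in its own right: it means the run completed and nothing meaningful moved, so no alert should fire.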

Ticketing, dashboards, and review loops

The output should land in a workflow people already use:

open a ticket for confirmed high-priority issues
attach the normalized record, not the raw vendor page
include first seen, last seen, and source confidence
let reviewers mark false positives and stale entries
feed that judgment back into the next run

That feedback loop matters more than a fancy dashboard. It trains the inventory to become less noisy over time.

What I would fix first and why

I would fix scope enforcement and normalization before I touched scoring or enrichment.

That is my clear ranking:

1. Scope enforcement
2. Canonical normalization
3. State storage and diffing
4. Retry and pagination handling
5. Scoring and reporting
6. Optional enrichment

Why that order? Because any mistake in the first two layers contaminates everything downstream. If scope is loose, you collect the wrong assets. If normalization is sloppy, you cannot dedupe them. Once those two are wrong, the best risk model in the world just decorates bad input.

So my practical view of AITEM, based on the public report and the usual shape of ASM tooling, is this: treat it as a source of evidence, not as a finished answer. Build the script so that every record can be explained, every score can be traced back to an observation, and every run can be compared to the last one.

That is the difference between attack surface management and a pile of results.