VectorAnalyzer

Type: vector_analyzer • Category: flow • Tags: vector, sentiment, ranking

Description

AI-powered semantic search engine that analyzes article collections using vector embeddings, cosine similarity scoring, and sentiment analysis. It finds the most relevant content by matching meaning rather than keywords alone, then ranks and filters the results.

Parameters

| Name | Type | Description | Required | Default |
|---|---|---|---|---|
| data | string | String representation of array of article objects with title, body, etc. | no | |
| query | string | Search query text for vector similarity (optional - if empty, returns all articles with sentiment only) | no | |
| fields | fields_multi | Fields to search and analyze. Select from available fields in your data. | no | |
| top_percentage | number | Percentage of top results to return (1-100%, higher = more results) | no | 40 |
| sort_by | string | Sort results by: relevance/similarity (highest similarity first), date | no | "relevance" |
| sortDirection | string | Sort direction: asc (ascending), desc (descending) | no | "desc" |
| skip_sentiment | boolean | Skip sentiment analysis for faster processing | no | false |

Help

VectorAnalyzer performs intelligent semantic search on article collections using AI-powered vector embeddings.

Use {{ ... }} in parameters to pull from workflow vars/data.

How It Works:

  • Converts your search query and article texts into mathematical vectors using advanced language models.
  • Calculates similarity using cosine similarity (0.0-1.0 scale, 1.0 = perfect semantic match).
  • Automatically filters results using dynamic thresholds based on your top_percentage setting.
  • Optionally analyzes sentiment (positive, negative, or neutral) for each matching article.
  • Supports flexible sorting by relevance or publication date.
  • If no query is provided, returns all articles with sentiment analysis only (no vector ranking).
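
The scoring and filtering steps above can be illustrated with a minimal Python sketch. It is not VectorAnalyzer's actual implementation: the embedding model is not shown (the vectors here are toy values), and the "dynamic threshold" is approximated by a plain rank cut, which is an assumption. The function names `cosine_similarity` and `top_percentage_filter` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as "no match"
    return dot / (norm_a * norm_b)

def top_percentage_filter(scored, top_percentage=40):
    """Keep the highest-scoring share of (item, score) pairs.

    Assumption: a simple rank cut stands in for the real thresholding.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    keep = max(1, math.ceil(len(ranked) * top_percentage / 100))
    return ranked[:keep]

# Toy 2-D "embeddings"; real embeddings have hundreds of dimensions.
query_vec = [1.0, 0.0]
article_vecs = {
    "article_a": [1.0, 0.0],
    "article_b": [0.6, 0.8],
    "article_c": [0.0, 1.0],
}
scored = [(name, cosine_similarity(query_vec, vec))
          for name, vec in article_vecs.items()]
best = top_percentage_filter(scored, top_percentage=40)
```

With three articles and top_percentage=40, the cut rounds up to two results, so smaller collections keep proportionally more items under this scheme.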

Parameters Guide:

  • data: JSON string array of articles. Each article should have text fields like 'title', 'body', 'content'. Example: [{"title": "Tesla Reports Q3 Earnings", "body": "Tesla announced...", "date": "2025-10-01T10:00:00Z"}]
  • query: (Optional) Your search text. Can be natural language like "stock market volatility" or "cryptocurrency trends". If empty, returns all articles.
  • fields: Select which fields from your data to search and analyze. The dropdown shows available fields from your connected data source.
  • top_percentage: Controls result quality vs quantity. 40% (default) = top 40% most relevant. Use 100% for all, 10% for only best matches.
  • sort_by: 'relevance'/'similarity' (ranked by semantic match) or 'date' (ranked by publication date). With the default sortDirection of 'desc', the best matches or the newest articles come first.
  • sortDirection: 'asc' (ascending order) or 'desc' (descending order, default).
  • skip_sentiment: Set to true to skip sentiment analysis (about 2x faster; results then contain no sentiment scores).
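
Because data is a string rather than a native array, it can help to build it with a JSON serializer instead of by hand. The sketch below uses Python's standard json module. The `params` dictionary is only an illustration of the parameter names and defaults listed above; how your workflow actually passes parameters to VectorAnalyzer, and whether fields is represented as a list, are assumptions.

```python
import json

# First article is the example from the guide above; the second is made-up sample data.
articles = [
    {
        "title": "Tesla Reports Q3 Earnings",
        "body": "Tesla announced...",
        "date": "2025-10-01T10:00:00Z",
    },
    {
        "title": "Sample headline",
        "body": "Sample body text.",
        "date": "2025-09-15T08:30:00Z",
    },
]

# Hypothetical parameter set mirroring the documented names and defaults.
params = {
    "data": json.dumps(articles),   # must be a string, not a list
    "query": "stock market volatility",
    "fields": ["title", "body"],    # assumed shape for fields_multi
    "top_percentage": 40,
    "sort_by": "relevance",
    "sortDirection": "desc",
    "skip_sentiment": False,
}
```

Serializing with json.dumps avoids quoting and escaping mistakes that are easy to make when article bodies contain quotes or newlines.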

Usage Examples:

  • Basic semantic search: query="artificial intelligence in finance", top_percentage=50
  • Search specific fields: Select 'title' and 'body' from the fields dropdown
  • Sentiment only (no search): Leave query empty, skip_sentiment=false
  • Find recent news: query="economic indicators", sort_by="date", sortDirection="desc", top_percentage=30
  • Find oldest news: query="historical events", sort_by="date", sortDirection="asc", top_percentage=30
  • Fast processing: query="market crash", skip_sentiment=true, top_percentage=20
  • From workflow data: data={{vars.articles}}, query={{vars.search_term}}
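
To make the interaction between sort_by and sortDirection in the examples above concrete, here is a small sketch of the ordering they describe. It assumes each result is a dictionary with a `similarity` score and an ISO 8601 `date` string like the one in the data example; `sort_results` is a hypothetical helper, not part of VectorAnalyzer.

```python
from datetime import datetime

def parse_date(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def sort_results(results, sort_by="relevance", sort_direction="desc"):
    """Order results by date or by similarity in the requested direction."""
    if sort_by == "date":
        key = lambda r: parse_date(r["date"])
    else:  # 'relevance' and 'similarity' are treated as the same ordering
        key = lambda r: r["similarity"]
    return sorted(results, key=key, reverse=(sort_direction == "desc"))

# Made-up sample results for illustration.
results = [
    {"title": "older, strong match", "date": "2025-09-15T08:30:00Z", "similarity": 0.9},
    {"title": "newer, weak match", "date": "2025-10-01T10:00:00Z", "similarity": 0.4},
]
newest_first = sort_results(results, sort_by="date", sort_direction="desc")
oldest_first = sort_results(results, sort_by="date", sort_direction="asc")
```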

Understanding Results:

  • similarity: Score from 0.0-1.0 (higher = better match)
  • sentiment: 'positive', 'negative', or 'neutral' (if not skipped)
  • sentiment_score: Raw sentiment confidence (0.0-1.0)
  • rank: Position in results (1 = best match)
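
As an illustration of how these fields might be used downstream, the sketch below picks out articles that are both relevant and confidently negative. It assumes the output is a list of dictionaries keyed by the field names above; the exact output shape is not documented here, and both the cutoffs and the helper name `strong_negative_matches` are arbitrary choices for the example.

```python
def strong_negative_matches(results, min_similarity=0.5, min_confidence=0.7):
    """Return relevant results whose sentiment is confidently negative."""
    return [
        r for r in results
        if r.get("similarity", 0.0) >= min_similarity
        and r.get("sentiment") == "negative"        # absent if sentiment was skipped
        and r.get("sentiment_score", 0.0) >= min_confidence
    ]

# Made-up sample output for illustration.
results = [
    {"rank": 1, "similarity": 0.82, "sentiment": "negative", "sentiment_score": 0.91},
    {"rank": 2, "similarity": 0.74, "sentiment": "positive", "sentiment_score": 0.88},
    {"rank": 3, "similarity": 0.41, "sentiment": "negative", "sentiment_score": 0.95},
]
flagged = strong_negative_matches(results)
```

Using `.get` with defaults keeps the filter from failing when skip_sentiment is true and the sentiment fields are missing; in that case it simply returns an empty list.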

Tips for Best Results:

  • Use descriptive queries: "renewable energy stocks" works better than "green stocks"
  • Adjust top_percentage based on data size: smaller datasets may need higher percentages
  • Date sorting is useful for time-sensitive news analysis
  • Skip sentiment for speed when you only need relevance ranking

Performance Notes:

  • Processing time scales with article count and text length
  • Sentiment analysis roughly doubles processing time
  • Results are cached for repeated queries on same data

If results look off, check the similarity scores first: if they are all low (<0.3), try rewording the query.
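
This check is easy to automate in a downstream step. The following sketch assumes the results are a list of dictionaries with a `similarity` field; `all_scores_low` is a hypothetical helper, and 0.3 is the rule-of-thumb cutoff mentioned above.

```python
def all_scores_low(results, threshold=0.3):
    """True when every result has a similarity score below the threshold."""
    scores = [r["similarity"] for r in results if "similarity" in r]
    # An empty score list is not treated as "low"; there is nothing to judge.
    return bool(scores) and max(scores) < threshold

# Made-up sample scores for illustration.
weak_results = [{"similarity": 0.12}, {"similarity": 0.27}]
needs_rewording = all_scores_low(weak_results)
```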