
LazyFrame - Optimized Expression-Based Filtering

LazyFrame is a lazy evaluation engine that queues multiple Expression-based filters and applies internal optimizations (predicate fusion, short-circuit evaluation) before materializing results.

JavaScript
const result = gaslamp.LazyFrame.from(df)
  .filter(new gaslamp.Expression("age").ge(18))
  .filter(new gaslamp.Expression("status").eq("active"))
  .collect();

When to Use LazyFrame

Use LazyFrame when

  • You need to apply multiple filter conditions efficiently
  • You want a declarative Expression-based API
  • You have complex conditions (AND/OR/NOT combinations)
  • You want built-in predicate optimization

Use BareFrame.filter() when

  • You only need a single filter
  • You need custom filter logic
  • You need other DataFrame transformations: select(), rename(), withColumn() (these are not available in LazyFrame)

Basic Usage

Multiple Filters (AND Logic)

Chaining multiple filter() calls combines conditions with AND:

JavaScript
const result = gaslamp.LazyFrame.from(df)
  .filter(new gaslamp.Expression("age").ge(18))
  .filter(new gaslamp.Expression("status").eq("active"))
  .collect();

// Internally fused: age >= 18 AND status = 'active'

Expression Operations

LazyFrame supports all Expression operations for filtering:

JavaScript
// Comparison
.filter(new gaslamp.Expression("age").gt(20))
.filter(new gaslamp.Expression("age").ge(18))
.filter(new gaslamp.Expression("age").lt(30))
.filter(new gaslamp.Expression("age").le(25))
.filter(new gaslamp.Expression("status").eq("active"))
.filter(new gaslamp.Expression("status").ne("inactive"))

// String operations
.filter(new gaslamp.Expression("name").startsWith("A"))
.filter(new gaslamp.Expression("name").contains("li"))
.filter(new gaslamp.Expression("email").endsWith("@example.com"))

// Membership (value is one of the listed items)
.filter(new gaslamp.Expression("age").in([25, 30, 35]))

// Null checks
.filter(new gaslamp.Expression("phone").isNull())
.filter(new gaslamp.Expression("phone").isNotNull())

See Expression Filtering for the full API.


Complex Conditions

AND Logic

JavaScript
// Chained filters are combined with AND
const result = gaslamp.LazyFrame.from(df)
  .filter(new gaslamp.Expression("age").ge(18))
  .filter(new gaslamp.Expression("department").eq("Engineering"))
  .collect();

// Or build the same condition as a single expression:
const expr = new gaslamp.Expression("age")
  .ge(18)
  .and(new gaslamp.Expression("department").eq("Engineering"));

const sameResult = gaslamp.LazyFrame.from(df).filter(expr).collect();

OR Logic

JavaScript
const expr = new gaslamp.Expression("age")
  .lt(18)
  .or(new gaslamp.Expression("age").gt(65));

const result = gaslamp.LazyFrame.from(df).filter(expr).collect();

// Rows where age < 18 OR age > 65

NOT Logic

JavaScript
const expr = new gaslamp.Expression("status")
  .eq("inactive")
  .not();

const result = gaslamp.LazyFrame.from(df).filter(expr).collect();

// All rows where status is NOT 'inactive'

Complex Combinations

JavaScript
const adults = new gaslamp.Expression("age").ge(18);
const engineers = new gaslamp.Expression("department").eq("Engineering");
const active = new gaslamp.Expression("status").eq("active");

const expr = adults.and(engineers).and(active);

const result = gaslamp.LazyFrame.from(df).filter(expr).collect();
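
The example above chains only AND. As a rough mental model of how AND, OR, and NOT nest inside one condition, the sketch below uses plain JavaScript predicate functions rather than gaslamp Expressions. The helpers `and`, `or`, `not` and the sample rows are hypothetical and are not part of gaslamp's API.

```javascript
// Hypothetical stand-ins for Expression combinators, built from plain functions.
// None of these helpers belong to gaslamp; they only model the boolean logic.
const and = (a, b) => (row) => a(row) && b(row);
const or = (a, b) => (row) => a(row) || b(row);
const not = (a) => (row) => !a(row);

const adults = (row) => row.age >= 18;
const engineers = (row) => row.department === "Engineering";
const inactive = (row) => row.status === "inactive";

// adults AND (engineers OR NOT inactive)
const predicate = and(adults, or(engineers, not(inactive)));

// Illustrative sample data
const rows = [
  { name: "Ana", age: 30, department: "Engineering", status: "inactive" },
  { name: "Ben", age: 17, department: "Engineering", status: "active" },
  { name: "Cy", age: 45, department: "Sales", status: "inactive" },
  { name: "Di", age: 22, department: "Sales", status: "active" },
];

const matched = rows.filter(predicate).map((row) => row.name);
console.log(matched); // ["Ana", "Di"]
```

With gaslamp Expressions, the same shape would presumably be written by nesting the documented .and(), .or(), and .not() calls; whether arbitrary nesting is supported is an assumption here.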

Lazy Evaluation

LazyFrame delays filter execution until collect() is called. This enables optimizations like:

  • Predicate Fusion: Multiple filters are combined into a single predicate
  • Short-Circuit Evaluation: For a given row, evaluation stops as soon as one condition fails

JavaScript
// Create lazy pipeline
const lazy = gaslamp.LazyFrame.from(df)
  .filter(new gaslamp.Expression("age").ge(18))
  .filter(new gaslamp.Expression("status").eq("active"));

// No filtering happens yet — filters are just queued

// Evaluation happens here
const result = lazy.collect();
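
To make the queue-then-evaluate behavior concrete, here is a minimal sketch of a lazy filter pipeline in plain JavaScript. `ToyLazy` is a hypothetical class written for illustration only. It is not gaslamp's implementation, and it works on plain arrays of objects instead of DataFrames. A counter shows that nothing runs before `collect()`, and that the second predicate is skipped for rows that already failed the first.

```javascript
// A toy lazy pipeline (hypothetical; not gaslamp's implementation).
class ToyLazy {
  constructor(rows, predicates = []) {
    this.rows = rows;
    this.predicates = predicates;
  }

  // filter() only records the predicate; no rows are touched here.
  filter(predicate) {
    return new ToyLazy(this.rows, [...this.predicates, predicate]);
  }

  // collect() fuses the queued predicates into one and scans the rows once.
  collect() {
    // Array.prototype.every stops at the first predicate that returns false,
    // which gives short-circuit evaluation per row.
    const fused = (row) => this.predicates.every((p) => p(row));
    return this.rows.filter(fused);
  }
}

let statusChecks = 0; // counts how often the second predicate runs

const people = [
  { age: 15, status: "active" },
  { age: 40, status: "active" },
  { age: 12, status: "inactive" },
  { age: 33, status: "inactive" },
];

const pipeline = new ToyLazy(people)
  .filter((row) => row.age >= 18)
  .filter((row) => {
    statusChecks += 1;
    return row.status === "active";
  });

const checksBeforeCollect = statusChecks; // still 0: nothing has run yet
const collected = pipeline.collect();

console.log(checksBeforeCollect); // 0
console.log(collected.length); // 1
console.log(statusChecks); // 2 (skipped for the two rows that failed the age check)
```

Returning a new `ToyLazy` from each `filter()` call keeps the sketch free of shared mutable state; a real engine may queue filters differently.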

Practical Examples

Find Active Users Over 30

JavaScript
const activeAdults = gaslamp.LazyFrame.from(users)
  .filter(new gaslamp.Expression("age").gt(30))
  .filter(new gaslamp.Expression("status").eq("active"))
  .collect();

Find Employees in Specific Departments

JavaScript
const engineers = gaslamp.LazyFrame.from(employees)
  .filter(
    new gaslamp.Expression("department").in(["Engineering", "R&D"])
  )
  .collect();

Multi-Condition Query

JavaScript
const highValue = gaslamp.LazyFrame.from(users)
  .filter(
    new gaslamp.Expression("accountValue")
      .ge(10000)
      .and(new gaslamp.Expression("status").eq("active"))
  )
  .collect();

Comparison: LazyFrame vs BareFrame

LazyFrame (Expression-based)

JavaScript
const result = gaslamp.LazyFrame.from(df)
  .filter(new gaslamp.Expression("age").ge(18))
  .filter(new gaslamp.Expression("status").eq("active"))
  .collect();

BareFrame (Predicate-based)

JavaScript
const result = df
  .filter(row => row.get("age") >= 18)
  .filter(row => row.get("status") === "active");

Note: LazyFrame fuses chained Expression-based filters into a single predicate before evaluating them. BareFrame applies each filter immediately and converts the data layout at every step. Use LazyFrame for multiple chained filters and BareFrame for single filters or custom logic.


Notes

  • LazyFrame is read-only and does not mutate the source DataFrame
  • All chained filters are combined with AND logic by default
  • Use Expression's .and(), .or(), .not() methods for complex conditions
  • LazyFrame is optimized for GAS environments with memory constraints
  • Always call collect() at the end to execute the filters and get a BareFrame; without it, you only have a queued pipeline that has not been evaluated
  • An empty DataFrame is returned unchanged, because it has no rows for the filters to match
  • Do not modify an Expression after passing it to filter()
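
One plausible reason for the last note: a lazy pipeline holds a reference to what it was given and reads it only at evaluation time, so later changes can leak into the result. The sketch below shows that effect with a hypothetical mutable condition object and a toy queue. It assumes nothing about how gaslamp actually stores Expressions.

```javascript
// Hypothetical mutable condition object; not a gaslamp Expression.
const condition = { column: "age", min: 18 };

// A toy queue that stores the condition by reference and reads it lazily.
const queued = [condition];
const collectRows = (rows) =>
  rows.filter((row) => queued.every((c) => row[c.column] >= c.min));

const ages = [{ age: 10 }, { age: 20 }, { age: 70 }];

const countBefore = collectRows(ages).length;

// Mutating the condition after it was queued silently changes later results.
condition.min = 65;
const countAfter = collectRows(ages).length;

console.log(countBefore, countAfter); // 2 1
```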

See Also