LazyFrame - Optimized Expression-Based Filtering¶
LazyFrame is a lazy evaluation engine that queues multiple Expression-based filters and applies internal optimizations (predicate fusion, short-circuit evaluation) before materializing results.
JavaScript
const result = gaslamp.LazyFrame.from(df)
.filter(new gaslamp.Expression("age").ge(18))
.filter(new gaslamp.Expression("status").eq("active"))
.collect();
When to Use LazyFrame¶
Use LazyFrame when¶
- You need to apply multiple filter conditions efficiently
- You want a declarative Expression-based API
- You have complex conditions (AND/OR/NOT combinations)
- You want built-in predicate optimization
Use BareFrame.filter() when¶
- You only need a single filter
- You need custom filter logic
- You need other DataFrame transformations: select(), rename(), withColumn() (these are not available in LazyFrame)
Basic Usage¶
Multiple Filters (AND Logic)¶
Chaining multiple filter() calls combines conditions with AND:
JavaScript
const result = gaslamp.LazyFrame.from(df)
.filter(new gaslamp.Expression("age").ge(18))
.filter(new gaslamp.Expression("status").eq("active"))
.collect();
// Internally fused: age >= 18 AND status = 'active'
Expression Operations¶
LazyFrame supports all Expression operations for filtering:
JavaScript
// Comparison
.filter(new gaslamp.Expression("age").gt(20))
.filter(new gaslamp.Expression("age").ge(18))
.filter(new gaslamp.Expression("age").lt(30))
.filter(new gaslamp.Expression("age").le(25))
.filter(new gaslamp.Expression("status").eq("active"))
.filter(new gaslamp.Expression("status").ne("inactive"))
// String operations
.filter(new gaslamp.Expression("name").startsWith("A"))
.filter(new gaslamp.Expression("name").contains("li"))
.filter(new gaslamp.Expression("email").endsWith("@example.com"))
// Array operations
.filter(new gaslamp.Expression("age").in([25, 30, 35]))
// Null checks
.filter(new gaslamp.Expression("phone").isNull())
.filter(new gaslamp.Expression("phone").isNotNull())
See Expression Filtering for the full API.
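As a rough mental model, each operation above corresponds to a plain row predicate. The sketch below is not gaslamp code: the `ops` table and its null handling are assumptions made for illustration, and the library's actual semantics (for example, how `isNull()` treats `undefined`, or how string operations treat non-string values) may differ.

```javascript
// Hypothetical plain-JavaScript equivalents of the Expression operations.
// Each entry takes the comparison value and returns a predicate on a cell value.
const ops = {
  gt: (x) => (v) => v > x,
  ge: (x) => (v) => v >= x,
  lt: (x) => (v) => v < x,
  le: (x) => (v) => v <= x,
  eq: (x) => (v) => v === x,
  ne: (x) => (v) => v !== x,
  startsWith: (s) => (v) => typeof v === "string" && v.startsWith(s),
  contains: (s) => (v) => typeof v === "string" && v.includes(s),
  endsWith: (s) => (v) => typeof v === "string" && v.endsWith(s),
  in: (list) => (v) => list.includes(v),
  // Assumption: null and undefined both count as "null" here.
  isNull: () => (v) => v === null || v === undefined,
  isNotNull: () => (v) => v !== null && v !== undefined,
};

const rows = [
  { name: "Alice", age: 25, phone: null },
  { name: "Bob", age: 17, phone: "555-0100" },
];

// Roughly what .filter(new gaslamp.Expression("age").ge(18)) selects:
const adults = rows.filter((row) => ops.ge(18)(row.age));
console.log(adults.map((row) => row.name)); // only "Alice" matches
```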
Complex Conditions¶
AND Logic¶
JavaScript
// Chained filters are combined with AND
const result = gaslamp.LazyFrame.from(df)
.filter(new gaslamp.Expression("age").ge(18))
.filter(new gaslamp.Expression("department").eq("Engineering"))
.collect();
// Or build the same condition as a single expression:
const expr = new gaslamp.Expression("age")
.ge(18)
.and(new gaslamp.Expression("department").eq("Engineering"));
// Named differently so it does not redeclare `result` from above
const sameResult = gaslamp.LazyFrame.from(df).filter(expr).collect();
OR Logic¶
JavaScript
const expr = new gaslamp.Expression("age")
.lt(18)
.or(new gaslamp.Expression("age").gt(65));
const result = gaslamp.LazyFrame.from(df).filter(expr).collect();
// Rows where age < 18 OR age > 65
NOT Logic¶
JavaScript
const expr = new gaslamp.Expression("status")
.eq("inactive")
.not();
const result = gaslamp.LazyFrame.from(df).filter(expr).collect();
// All rows where status is NOT 'inactive'
Complex Combinations¶
JavaScript
const adults = new gaslamp.Expression("age").ge(18);
const engineers = new gaslamp.Expression("department").eq("Engineering");
const active = new gaslamp.Expression("status").eq("active");
const expr = adults.and(engineers).and(active);
const result = gaslamp.LazyFrame.from(df).filter(expr).collect();
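The combination above is a pure AND chain. When AND, OR, and NOT are mixed, the grouping is decided by how the method calls are nested, because the argument to `.and()` or `.or()` is built before the outer call runs. The sketch below makes that visible with a toy `Pred` class written for this example; it is not gaslamp's `Expression`, whose internals are not described here.

```javascript
// Toy stand-in for an Expression: wraps a row predicate and a readable label.
// Hypothetical; written only to illustrate how combinators nest.
class Pred {
  constructor(test, label) {
    this.test = test;
    this.label = label;
  }
  and(other) {
    return new Pred((r) => this.test(r) && other.test(r), `(${this.label} AND ${other.label})`);
  }
  or(other) {
    return new Pred((r) => this.test(r) || other.test(r), `(${this.label} OR ${other.label})`);
  }
  not() {
    return new Pred((r) => !this.test(r), `NOT ${this.label}`);
  }
}

const adults = new Pred((r) => r.age >= 18, "age >= 18");
const engineers = new Pred((r) => r.department === "Engineering", "department = 'Engineering'");
const inactive = new Pred((r) => r.status === "inactive", "status = 'inactive'");

// Grouping follows the nesting of the calls, not operator precedence.
const a = adults.and(engineers.or(inactive.not()));
const b = adults.and(engineers).or(inactive.not());

console.log(a.label); // (age >= 18 AND (department = 'Engineering' OR NOT status = 'inactive'))
console.log(b.label); // ((age >= 18 AND department = 'Engineering') OR NOT status = 'inactive')

const minor = { age: 16, department: "Sales", status: "active" };
console.log(a.test(minor)); // false
console.log(b.test(minor)); // true
```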
Lazy Evaluation¶
LazyFrame delays filter execution until collect() is called.
This enables optimizations like:
- Predicate Fusion: Multiple filters are combined into a single predicate
- Short-Circuit Evaluation: For each row, evaluation stops at the first condition that fails, so the remaining conditions are skipped for that row
JavaScript
// Create lazy pipeline
const lazy = gaslamp.LazyFrame.from(df)
.filter(new gaslamp.Expression("age").ge(18))
.filter(new gaslamp.Expression("status").eq("active"));
// No filtering happens yet — filters are just queued
// Evaluation happens here
const result = lazy.collect();
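To make the two optimizations concrete, here is a minimal sketch of a lazy filter pipeline in plain JavaScript. `TinyLazy` is a hypothetical class written for this example and is not how gaslamp is implemented; it only shows the general idea of queueing predicates, fusing them into one, and short-circuiting per row.

```javascript
// Hypothetical illustration of lazy filtering; not gaslamp's implementation.
class TinyLazy {
  constructor(rows, predicates = []) {
    this.rows = rows;
    this.predicates = predicates;
  }
  // filter() only records the predicate; no rows are touched yet.
  filter(predicate) {
    return new TinyLazy(this.rows, [...this.predicates, predicate]);
  }
  // collect() fuses the queued predicates into one and scans the rows once.
  collect() {
    // Array.prototype.every stops at the first predicate that returns false,
    // which gives short-circuit behaviour for an AND of conditions.
    const fused = (row) => this.predicates.every((p) => p(row));
    return this.rows.filter(fused);
  }
}

let statusChecks = 0;
const users = [
  { age: 15, status: "active" },
  { age: 42, status: "active" },
  { age: 30, status: "inactive" },
];

const lazy = new TinyLazy(users)
  .filter((row) => row.age >= 18)
  .filter((row) => {
    statusChecks += 1; // count how often the second condition runs
    return row.status === "active";
  });

console.log(statusChecks); // 0, nothing has been evaluated yet
const result = lazy.collect();
console.log(result.length); // 1
console.log(statusChecks); // 2, the age-15 row never reached the status check
```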
Practical Examples¶
Find Active Users Over 30¶
JavaScript
const activeAdults = gaslamp.LazyFrame.from(users)
.filter(new gaslamp.Expression("age").gt(30))
.filter(new gaslamp.Expression("status").eq("active"))
.collect();
Find Employees in Specific Departments¶
JavaScript
const engineers = gaslamp.LazyFrame.from(employees)
.filter(
new gaslamp.Expression("department").in(["Engineering", "R&D"])
)
.collect();
Multi-Condition Query¶
JavaScript
const highValue = gaslamp.LazyFrame.from(users)
.filter(
new gaslamp.Expression("accountValue")
.ge(10000)
.and(new gaslamp.Expression("status").eq("active"))
)
.collect();
Comparison: LazyFrame vs BareFrame¶
LazyFrame (Expression-based)¶
JavaScript
const result = gaslamp.LazyFrame.from(df)
.filter(new gaslamp.Expression("age").ge(18))
.filter(new gaslamp.Expression("status").eq("active"))
.collect();
BareFrame (Predicate-based)¶
JavaScript
const result = df
.filter(row => row.get("age") >= 18)
.filter(row => row.get("status") === "active");
Note: LazyFrame provides optimized predicate fusion when chaining multiple Expression-based filters. BareFrame applies filters immediately and converts data layout on each step. Use LazyFrame for multiple chained filters, BareFrame for single filters or custom logic.
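The difference can be pictured with plain arrays. The sketch below is an analogy only and does not use gaslamp: it contrasts running each filter as its own pass, which builds a new result per step, with a single pass over a fused predicate. The helper names `eagerFilter` and `fusedFilter` are made up for this example.

```javascript
// Analogy with plain arrays; helper names are hypothetical.
const rows = [
  { age: 17, status: "active" },
  { age: 25, status: "active" },
  { age: 40, status: "inactive" },
];
const isAdult = (row) => row.age >= 18;
const isActive = (row) => row.status === "active";

// Eager: every filter runs immediately and produces its own array.
function eagerFilter(data, predicates) {
  let current = data;
  let arraysBuilt = 0;
  for (const p of predicates) {
    current = current.filter(p); // one pass and one new array per step
    arraysBuilt += 1;
  }
  return { rows: current, arraysBuilt };
}

// Fused: the predicates are combined first, then applied in a single pass.
function fusedFilter(data, predicates) {
  const fused = (row) => predicates.every((p) => p(row));
  return { rows: data.filter(fused), arraysBuilt: 1 };
}

const eager = eagerFilter(rows, [isAdult, isActive]);
const fused = fusedFilter(rows, [isAdult, isActive]);

console.log(eager.arraysBuilt, fused.arraysBuilt); // 2 1
console.log(eager.rows.length === fused.rows.length); // true
```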
Notes¶
- LazyFrame is read-only and does not mutate the source DataFrame
- All chained filters are combined with AND logic by default
- Use Expression's .and(), .or(), .not() methods for complex conditions
- LazyFrame is optimized for GAS environments with memory constraints
- Always call collect() at the end to execute the filters and get a BareFrame (without it, you only have a queued pipeline that hasn't been evaluated)
- Empty DataFrames are returned unchanged (no rows match the filters)
- Do not modify an Expression after passing it to filter()
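The last note follows from lazy evaluation in general: a queued condition is only read when the pipeline is finally evaluated, so a change made in between can silently alter the result. The sketch below reproduces that hazard with a toy queue in plain JavaScript. The `condition` object and `queue` array are hypothetical stand-ins, and whether gaslamp's `Expression` objects can actually be mutated this way is not stated above.

```javascript
// Toy illustration of why a queued condition should not be modified.
// `condition` and `queue` are hypothetical stand-ins, not gaslamp objects.
const ages = [16, 20, 35];
const condition = { min: 18 };

// Queue a filter that reads `condition` only when it is finally run.
const queue = [(age) => age >= condition.min];

// Mutating the condition after queueing, but before evaluation...
condition.min = 30;

// ...changes what the deferred filter selects.
const collected = ages.filter((age) => queue.every((p) => p(age)));
console.log(collected); // only 35 remains; 20 is dropped despite passing the original threshold
```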
See Also¶
- Expression Filtering - Full Expression API reference
- Filtering Rows - BareFrame.filter() guide
- DataFrame Basics - Core BareFrame operations