Query expressions
DRAFT DRAFT DRAFT DRAFT
These are my raw notes on section 7.16 of the C# Language Specification. Section 7.16 falls within section 7 on expressions.
Heuristic Model
- Notes. Add notes for each section.
- Definitions. Add definitions for the chapter.
- Examples. After adding definitions, add examples.
- Edit. After adding examples, edit for readability etc.
My Personal Conventions
- terminology is italicized
- code is in back ticks
Intro
query expression syntax is similar to that of relational and hierarchical query languages
- begins with a `from` clause
- ends with either a `select` or `group` clause
- after the initial `from` can come zero or more of these clauses:
  - `from`
  - `let`
  - `where`
  - `join`
  - `orderby`
`from` clause is a generator and includes:
- a range variable...
- which ranges over the elements of a sequence
`let` clause
- introduces a range variable
- representing a value computed by means of previous range variables
`where` clause
- is a filter
- that excludes items from the result
`join` clause
- compares specified keys of the source sequence
- with keys of another sequence
- yielding matching pairs
`orderby` clause
- reorders items
- according to specified criteria
`select` or `group` clause
- specifies the shape of the result
- in terms of the range variables
`into` clause
- can "splice" queries
- by treating the results of one query
- as a generator in a subsequent query
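
The clause descriptions above can be tried out together in one query. This is only a sketch: the `customers` and `orders` data, the tax factor, and all names are made up for illustration and do not come from the spec.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var customers = new[]
{
    (Id: 1, Name: "Ada", City: "London"),
    (Id: 2, Name: "Bo", City: "Paris"),
};
var orders = new[]
{
    (CustomerId: 1, Total: 50m),
    (CustomerId: 1, Total: 150m),
    (CustomerId: 2, Total: 300m),
};

var query =
    from c in customers                            // generator: c ranges over customers
    join o in orders on c.Id equals o.CustomerId   // matching customer/order pairs
    let withTax = o.Total * 1.2m                   // range variable computed from o
    where withTax > 100m                           // filter
    orderby withTax descending                     // reorder
    group withTax by c.City into g                 // into: splice into a follow-on query
    select new { City = g.Key, Count = g.Count() };

foreach (var row in query)
    Console.WriteLine($"{row.City}: {row.Count}");
```

With this data the 60 (50 plus tax) order is filtered out, so each city ends up with one order, Paris first because its order sorts highest.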
Ambiguities
How to avoid ambiguity when these contextual keywords are mixed with simple names:
- `from`
- `where`
- `join`
- `on`
- `equals`
- `into`
- `let`
- `orderby`
- `ascending`
- `descending`
- `select`
- `group`
- `by`
The above are keywords when they occur anywhere within a query expression.
To use these words as identifiers within a query expression, prefix them with `@`
from @select
in (new string[] { "from", "select" })
select @select
For this purpose, a query expression is any expression that
- starts with `from` *identifier*
- followed by any token except `;`, `=`, or `,`
Translation
The steps for turning a query expression into fluent syntax.
- C# does not specify the execution semantics of query expressions.
- Rather, the compiler translates query expressions into invocations of these methods:
  - `Where`
  - `Select`
  - `SelectMany`
  - `Join`
  - `GroupJoin`
  - `OrderBy`
  - `OrderByDescending`
  - `ThenBy`
  - `ThenByDescending`
  - `GroupBy`
  - `Cast`
- These methods must have particular
  - signatures
  - result types
- These methods can be
  - instance methods of the object being queried, or
  - extension methods that are external to the object.
  - [I'd like to see an example of overriding the LINQ extension methods]
- The translation
  - is a syntactic mapping
  - occurs prior to any type binding or overload resolution
  - is guaranteed to be syntactically correct
  - is NOT guaranteed to produce semantically correct C# code
- the resulting methods are invoked as regular methods
  - and this may result in normal method call errors
- the compiler applies each translation section in order
  - each section is applied exhaustively
  - once exhausted, a section is not later revisited in the same query
- assignment to range variables is NOT allowed in a query expression, though this rule need not be strictly enforced in all C# implementations
- certain translations inject range variables with transparent identifiers denoted by `*`
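
Because the translation is purely syntactic, any type with suitably shaped `Where` and `Select` methods can be queried. The sketch below assumes a made-up `Box<T>` type (not part of any library) whose instance methods record that they were called; its `Trace` property exists only to make the calls visible.

```csharp
using System;
using System.Collections.Generic;

var box = new Box<int>(new[] { 1, 2, 3, 4 });

// Translated to box.Where(n => n % 2 == 0).Select(n => n * 10);
// ordinary overload resolution then binds to Box<T>'s instance methods.
var evens = from n in box
            where n % 2 == 0
            select n * 10;

Console.WriteLine(string.Join(",", evens.Items)); // 20,40
Console.WriteLine(evens.Trace);                   // Where;Select;

// Hypothetical type for illustration only.
class Box<T>
{
    public List<T> Items { get; }
    public string Trace { get; }

    public Box(IEnumerable<T> items, string trace = "")
    {
        Items = new List<T>(items);
        Trace = trace;
    }

    public Box<T> Where(Func<T, bool> predicate) =>
        new Box<T>(Items.FindAll(x => predicate(x)), Trace + "Where;");

    public Box<U> Select<U>(Func<T, U> selector) =>
        new Box<U>(Items.ConvertAll(x => selector(x)), Trace + "Select;");
}
```

`Box<T>` does not implement `IEnumerable<T>`, so the `System.Linq` extension methods are not candidates here; the query compiles only because the instance methods exist.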
1. `select` and `group by` clauses with continuations
- from ... into x ...
- translates into
- from x in ( from ... ) ...
Example
from c in customers group c by c.Country into g select new { Country = g.Key }
becomes
from g in ( from c in customers group c by c.Country ) select new { Country = g.Key }
then becomes
customers.GroupBy(c => c.Country) . Select ( g => new { Country = g.Key } )
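
A quick way to convince yourself that the three forms above are equivalent is to run them side by side. This sketch assumes made-up `customers` data and adds a `Count` member that is not in the example above.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var customers = new[]
{
    (Name: "Ada", Country: "UK"),
    (Name: "Bo", Country: "FR"),
    (Name: "Cy", Country: "UK"),
};

// Step 0: query with a continuation.
var viaContinuation = from c in customers
                      group c by c.Country into g
                      select new { Country = g.Key, Count = g.Count() };

// Step 1: the continuation rewritten as a nested query.
var viaNestedQuery = from g in (from c in customers group c by c.Country)
                     select new { Country = g.Key, Count = g.Count() };

// Step 2: the final method calls.
var viaMethods = customers.GroupBy(c => c.Country)
                          .Select(g => new { Country = g.Key, Count = g.Count() });

// Anonymous types compare by value, so SequenceEqual can compare the results.
Console.WriteLine(viaContinuation.SequenceEqual(viaNestedQuery)
                  && viaNestedQuery.SequenceEqual(viaMethods)); // True
```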
2. Explicit range variable types
from
- from T x in e
- translates into
- from x in (e).Cast<T>()
join
- join T x in e on k1 equals k2
- translates into
- join x in ( e ).Cast<T>() on k1 equals k2
Example
from Customer c in customers where c.City == "London" select c
becomes
from c in customers.Cast<Customer>() where c.City == "London" select c
then becomes
customers.Cast<Customer>().Where(c => c.City == "London")
Note
The .Cast<T>() operates on each object in the collection (as opposed to casting the collection).
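
An explicit range variable type is mostly useful for non-generic collections. As a sketch, `ArrayList` only implements the non-generic `IEnumerable`, so it needs the `Cast<T>()` step before `Where` can apply. The city names are made up.

```csharp
using System;
using System.Collections;
using System.Linq;

var items = new ArrayList { "London", "Paris", "Lima" };

// Explicit range variable type...
var viaQuery = from string s in items
               where s.StartsWith("L")
               select s;

// ...is translated into a Cast<string>() call on the source.
var viaMethods = items.Cast<string>().Where(s => s.StartsWith("L"));

Console.WriteLine(string.Join(",", viaQuery));      // London,Lima
Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```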
3. Degenerate query expressions
A degenerate query expression is one that trivially selects the elements of the source.
- from x in e select x
- translates into
- ( e ).Select(x => x)
Example
from c in customers select c
becomes
customers.Select(c => c)
Notes
- if a query expression consists only of a degenerate query,
  - then the translation appends a `.Select()`
- that said, if other translation steps introduce a degenerate query,
  - a later phase of the translation
  - will replace the degenerate query with just its source
- it is important to ensure that the result of a query expression is never the source itself
  - lest we reveal the type and identity of the source to the client of the query
  - [why would that be problematic?]
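
One way to see what the appended `Select` buys us is to check whether the query result is the source object. A small sketch, assuming a `List<int>` source (my choice, not the spec's):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var source = new List<int> { 1, 2, 3 };

// Degenerate query written directly in code: translated to source.Select(x => x).
var query = from x in source select x;

Console.WriteLine(ReferenceEquals(query, source)); // False
Console.WriteLine(query is List<int>);             // False
Console.WriteLine(query.SequenceEqual(source));    // True
```

If the query simply evaluated to `source`, a caller could cast the result back to `List<int>` and add or remove items. The extra `Select` hands back a different object that only exposes enumeration.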
4. `from`, `let`, `where`, `join`, and `orderby` clauses
A query expression with a...
...second `from` clause followed by a...
- This is the `SelectMany`. It isn't a query continuation.
- The `select` clause has access to the range variables from both the first and second `from` clauses.
...`select` clause
- from x1 in e1 from x2 in e2 select v
- ( e1 ) . SelectMany ( x1 => e2, ( x1 , x2 ) => v )
- from c in customers from o in c.Orders select new { c.Name, o.OrderId, o.Total }
- customers.SelectMany(c => c.Orders, ( c, o ) => new { c.Name, o.OrderId, o.Total } )
...something other than a `select` clause
- from x1 in e1 from x2 in e2 ...
- from * in ( e1 ) . SelectMany( x1 => e2 , ( x1, x2 ) => new { x1, x2 } )
- from c in customers from o in c.Orders...
- from * in customers.SelectMany( c => c.Orders, ( c, o ) => new { c, o } ) ...
Recall that `*` is the transparent identifier. It captures multiple range variables as members of a single anonymous object. In the above case, it stands for the object created by `new { x1, x2 }`.
Note, in both the above examples, the range variables of both `from` clauses stay in scope; that is, both are available in subsequent clauses.
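
The two-`from` case followed directly by `select` can be checked against a hand-written `SelectMany`. A sketch with made-up data, where each customer carries an array of order totals:

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var customers = new[]
{
    (Name: "Ada", Orders: new[] { 10m, 20m }),
    (Name: "Bo",  Orders: new[] { 5m }),
};

var viaQuery = from c in customers
               from o in c.Orders
               select new { c.Name, Total = o };

// First lambda: the inner sequence for each c. Second lambda: the result for each (c, o) pair.
var viaMethods = customers.SelectMany(c => c.Orders,
                                      (c, o) => new { c.Name, Total = o });

Console.WriteLine(viaQuery.Count());                   // 3
Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```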
`let` clause
The variable defined within the let clause has access to the initial range variable and, along with it, is available through the rest of the query.
- from x in e let y = f ...
- from * in ( e ) . Select ( x => new { x, y = f } ) ...
- from o in orders let t = o.Details.Sum(d => d.UnitPrice * d.Quantity) ...
- from * in orders.Select(o => new { o, t = o.Details.Sum(d => d.UnitPrice * d.Quantity ) } ) ...
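
To see what happens to `*` after a `let`, the sketch below writes the transparent identifier out by hand as a parameter named `ti` (my name, the compiler's is not visible). The `orders` data and the `where`/`select` tail are assumptions added for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: each order has an Id and some detail lines.
var orders = new[]
{
    (Id: 1, Details: new[] { (UnitPrice: 2m, Quantity: 3) }),
    (Id: 2, Details: new[] { (UnitPrice: 10m, Quantity: 20), (UnitPrice: 1m, Quantity: 5) }),
};

var viaQuery = from o in orders
               let t = o.Details.Sum(d => d.UnitPrice * d.Quantity)
               where t >= 100m
               select new { o.Id, Total = t };

// The let becomes a Select that pairs o with t; later clauses read both through ti.
var viaMethods = orders
    .Select(o => new { o, t = o.Details.Sum(d => d.UnitPrice * d.Quantity) })
    .Where(ti => ti.t >= 100m)
    .Select(ti => new { ti.o.Id, Total = ti.t });

Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```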
`where` clause
- from x in e where f ...
- from x in ( e ) . Where ( x => f ) ...
- from o in orders where o.Id > 0 ...
- from o in orders.Where(o => o.Id > 0) ...
`join` clause without an `into` followed by a...
...`select` clause
- from x1 in e1 join x2 in e2 on k1 equals k2 select v
- ( e1 ) . Join ( e2, x1 => k1, x2 => k2, ( x1, x2 ) => v )
...something other than a `select` clause
In this case, the transparent identifier `*` holds the place of the anonymous `new { x1, x2 }`
- from x1 in e1 join x2 in e2 on k1 equals k2 ...
- from * in ( e1 ) . Join ( e2, x1 => k1, x2 => k2, ( x1, x2 ) => new { x1, x2 } ) ...
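
The `join` followed by `select` form maps onto a single `Join` call. A sketch with made-up `customers` and `orders`:

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the order for customer 3 has no matching customer.
var customers = new[] { (Id: 1, Name: "Ada"), (Id: 2, Name: "Bo") };
var orders = new[]
{
    (CustomerId: 1, Total: 50m),
    (CustomerId: 1, Total: 70m),
    (CustomerId: 3, Total: 9m),
};

var viaQuery = from c in customers
               join o in orders on c.Id equals o.CustomerId
               select new { c.Name, o.Total };

// k1 uses only x1 (c), k2 uses only x2 (o); the last lambda builds the result.
var viaMethods = customers.Join(orders,
                                c => c.Id,
                                o => o.CustomerId,
                                (c, o) => new { c.Name, o.Total });

Console.WriteLine(viaQuery.Count());                   // 2
Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```

Only matching pairs survive: "Bo" has no orders and the order for customer 3 has no customer, so neither appears.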
`join` clause with an `into` followed by a...
The `into` makes the `join` into a group join.
...`select` clause
The output here is the initial range variable x1 and the group formed from the second range variable x2. In other words, x1 remains in scope but x2 doesn't because it's behind g.
- from x1 in e1 join x2 in e2 on k1 equals k2 into g select v
- ( e1 ) . GroupJoin ( e2, x1 => k1, x2 => k2, ( x1, g ) => v )
...something other than a `select` clause
- from x1 in e1 join x2 in e2 on k1 equals k2 into g ...
- from * in ( e1 ) . GroupJoin ( e2, x1 => k1, x2 => k2, ( x1, g ) => new { x1, g } ) ...
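
A group join keeps every element of the first sequence, paired with its (possibly empty) group of matches. A sketch with made-up data, using the `select` form:

```csharp
using System;
using System.Linq;

// Hypothetical sample data; "Bo" has no orders.
var customers = new[] { (Id: 1, Name: "Ada"), (Id: 2, Name: "Bo") };
var orders = new[]
{
    (CustomerId: 1, Total: 50m),
    (CustomerId: 1, Total: 70m),
};

var viaQuery = from c in customers
               join o in orders on c.Id equals o.CustomerId into g
               select new { c.Name, Count = g.Count() };

// The result lambda sees c and the group g, but no longer sees o.
var viaMethods = customers.GroupJoin(orders,
                                     c => c.Id,
                                     o => o.CustomerId,
                                     (c, g) => new { c.Name, Count = g.Count() });

foreach (var row in viaQuery)
    Console.WriteLine($"{row.Name}: {row.Count}");     // Ada: 2, then Bo: 0
Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```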
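`orderby` clause
- from x in e orderby k1, k2, k3 ...
- from x in ( e ) . OrderBy ( x => k1 ) . ThenBy ( x => k2 ) . ThenBy ( x => k3 ) ...
an ordering followed by `descending` produces `OrderByDescending` or `ThenByDescending` instead
- from x in ( e ) . OrderByDescending ( x => k1 ) . ThenByDescending ( x => k2 ) ...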
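
Each ordering in the list picks its own method, so directions can be mixed. A sketch with made-up `people` data:

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var people = new[]
{
    (Name: "Cy", Age: 30),
    (Name: "Ada", Age: 30),
    (Name: "Bo", Age: 25),
};

var viaQuery = from p in people
               orderby p.Age descending, p.Name
               select p.Name;

// First ordering -> OrderByDescending, second (ascending by default) -> ThenBy.
var viaMethods = people.OrderByDescending(p => p.Age)
                       .ThenBy(p => p.Name)
                       .Select(p => p.Name);

Console.WriteLine(string.Join(",", viaQuery));         // Ada,Cy,Bo
Console.WriteLine(viaQuery.SequenceEqual(viaMethods)); // True
```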
5. `select` clauses
- from x in e select v
- ( e ) . Select ( x => v )
The `=>` is a projection from each value of `x` into `v`. If `v` is simply a repeat of `x`, then the translation is just `( e )`.
6. `group by` clauses
- from x in e group v by k
- ( e ) . GroupBy ( x => k , x => v )
The exception is when `v` is the identifier `x`, in which case the result is `( e ) . GroupBy ( x => k )`
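
When the grouped value is something other than the range variable, `GroupBy` gets a second (element selector) lambda. A sketch with made-up data:

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var customers = new[]
{
    (Name: "Ada", Country: "UK"),
    (Name: "Bo", Country: "FR"),
    (Name: "Cy", Country: "UK"),
};

// group v by k, where v (c.Name) is not the range variable itself...
var viaQuery = from c in customers group c.Name by c.Country;

// ...so GroupBy receives a key selector and an element selector.
var viaMethods = customers.GroupBy(c => c.Country, c => c.Name);

var queryLines = viaQuery.Select(g => g.Key + ": " + string.Join(",", g)).ToArray();
var methodLines = viaMethods.Select(g => g.Key + ": " + string.Join(",", g)).ToArray();

foreach (var line in queryLines)
    Console.WriteLine(line);                            // UK: Ada,Cy then FR: Bo
Console.WriteLine(queryLines.SequenceEqual(methodLines)); // True
```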
7. Transparent identifiers
- some translations inject range variables with *transparent identifiers*
- the `*` denotes these
  - they are NOT a proper language feature
  - rather, they exist only as an intermediate step during translation
- later translation steps propagate the `*` into either
  - anonymous functions
  - anonymous object initializers
- when a `*` occurs as a parameter in an anonymous function,
  - then the members of the associated anonymous type
  - are automatically in scope in the anonymous function body
- when a `*` occurs as a member declarator in an anonymous object initializer,
  - then it introduces a member with a transparent identifier
- `*` are always introduced together with anonymous types
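
Since `*` cannot be written in real code, the closest hand-written equivalent is an ordinary lambda parameter whose members are accessed explicitly. The sketch below uses a parameter named `ti` (my name) and made-up data to mimic what the compiler does for a two-`from` query followed by `orderby`.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var customers = new[]
{
    (Name: "Ada", Orders: new[] { 10m, 30m }),
    (Name: "Bo",  Orders: new[] { 20m }),
};

// In the query, c and o are both simply "in scope" after the second from.
var viaQuery = from c in customers
               from o in c.Orders
               orderby o descending
               select new { c.Name, Total = o };

// By hand, the pair { c, o } must be carried along and dotted into: ti.c, ti.o.
var viaMethods = customers
    .SelectMany(c => c.Orders, (c, o) => new { c, o })
    .OrderByDescending(ti => ti.o)
    .Select(ti => new { ti.c.Name, Total = ti.o });

Console.WriteLine(string.Join(",", viaQuery.Select(r => r.Name))); // Ada,Bo,Ada
Console.WriteLine(viaQuery.SequenceEqual(viaMethods));             // True
```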
Pattern
- Types can implement this pattern to support query expressions on those types.
- Types have flexibility in how they implement query expressions.
  - implement the methods as
    - instance methods or
    - extension methods,
    - because the invocation syntax is identical
  - implement the parameters as
    - delegates or
    - expression trees,
    - because anonymous functions are convertible to both
The spec shows the recommended shape of a generic type `C<T>` that supports query expressions.
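
The delegate-or-expression-tree choice is visible from the outside. As a sketch, the made-up `Describer<T>` type below (not from the spec or any library) declares `Where` with an `Expression<Func<T, bool>>` parameter, so the same query syntax hands it the lambda as data rather than as compiled code.

```csharp
using System;
using System.Linq.Expressions;

// Translated to new Describer<int>().Where(n => n > 10).
// The trailing "select n" adds no Select call, so Where alone is enough here.
var described = from n in new Describer<int>()
                where n > 10
                select n;

Console.WriteLine(described.Description);

// Hypothetical type for illustration only.
class Describer<T>
{
    public string Description { get; }

    public Describer(string description = "") => Description = description;

    // An expression-tree parameter lets the method inspect the predicate instead of running it.
    public Describer<T> Where(Expression<Func<T, bool>> predicate) =>
        new Describer<T>(Description + "Where(" + predicate + ")");
}
```

This is roughly the design choice that lets query providers turn a predicate into something else (for example a database query) instead of executing it in memory.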
Terminology in Approximate Order of First Occurrence
- *query expression*
  - any expression that starts with `from` *identifier*
  - followed by any token except: `;`, `=`, `,`
  - Prefix contextual keywords with `@` to use them as identifiers in a query expression.
- *expression*
  - a line of code
  - that evaluates to a value
- *clause*
  - a part of a statement
  - that does not constitute a complete statement
- *generator*
  - a special type of routine
  - that controls the iteration behavior of a loop
  - yields values one at a time
  - a generator has parameters
  - other code can call a generator
  - a generator generates a series of values
  - because generators yield values one at a time
  - instead of returning all the values at once
- *range variable*
  - create these in a `from` or `let` clause
  - stores each subsequent value that a generator yields
- *sequence*
  - read-only
  - forward-only
  - one item at a time
  - can be lazily generated
  - potentially infinite
  - http://stackoverflow.com/questions/2627172/the-difference-between-lists-and-sequences
- *token*
  - white space and comments are not tokens
  - the following are tokens
    - identifier
    - keyword
    - integer-literal
    - real-literal
    - character-literal
    - string-literal
    - operator-or-punctuator
- *contextual keywords*
  - `from`
  - `select`
  - `group`
  - `by`
  - `let`
  - `where`
  - `join`
  - `on`
  - `equals`
  - `into`
  - `orderby`
  - `ascending`
  - `descending`
- methods that query expressions translate into
  - `Where`
  - `Select`
  - `SelectMany`
  - `Join`
  - `GroupJoin`
  - `OrderBy`
  - `OrderByDescending`
  - `ThenBy`
  - `ThenByDescending`
  - `GroupBy`
  - `Cast`
- *translation*
  - first into another query
  - then into methods
- the variable immediately following the `from`
- *transparent identifier*
  - represented with `*`
  - exists only as an intermediate step in query translation
  - later steps propagate it into anonymous functions and anonymous object initializers
  - tends to capture multiple range variables as members of a single object
- *degenerate query expression*
  - trivially selects the elements of the source
  - [the `Select` that the translation appends prevents calling code from being able to modify the source]