Parsing S-Expressions in C# using OMeta

Parsing S-Expressions in C# is easy with OMeta. Our grammar is limited to lists and to atoms of string, symbol, and number types, so it is not complete, but it can easily be extended with OMeta. I wrote this article because of the lack of publicly available S-Expression parsers for C#/.NET.

Our parser converts the expression (+ (* 3 4 5 6) (- 7 1) ) to the following tree:

[Figure: parsed-s-expression, the tree produced by the parser]

Each vertex of the tree is represented by a C# class containing an ArrayList, Symbol, String, or Integer. Note that the expression (1) is different from the same expression without parentheses: the first is a list containing one atom, and the second is just the atom.
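To make the list-versus-atom distinction concrete, here is a minimal sketch of a hand-written reader for the same restricted grammar (lists, strings, symbols, and integers). It is written in Python only for brevity and is not the OMeta# grammar from the project; the names Symbol, tokenize, read, and parse are assumptions made for this illustration, and strings are assumed to contain no escaped quotes.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """A bare name such as + or foo (hypothetical stand-in for the C# Symbol class)."""
    name: str


# One token per match: a parenthesis, a double-quoted string, or a bare word.
TOKEN = re.compile(r'\s*(?:([()])|"([^"]*)"|([^\s()"]+))')


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        paren, string, bare = match.groups()
        if paren:
            tokens.append(("paren", paren))
        elif string is not None:
            tokens.append(("str", string))
        else:
            tokens.append(("bare", bare))
        pos = match.end()
    return tokens


def read(tokens, pos):
    """Read one expression starting at tokens[pos]; return (value, next position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, value = tokens[pos]
    if kind == "paren":
        if value == ")":
            raise SyntaxError("unexpected ')'")
        items, pos = [], pos + 1
        while pos < len(tokens) and tokens[pos] != ("paren", ")"):
            item, pos = read(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise SyntaxError("missing ')'")
        return items, pos + 1  # skip the closing parenthesis
    if kind == "str":
        return value, pos + 1
    try:
        return int(value), pos + 1  # bare word that is an integer
    except ValueError:
        return Symbol(value), pos + 1  # any other bare word is a symbol


def parse(text):
    value, pos = read(tokenize(text), 0)
    if pos != len(tokenize(text)):
        raise SyntaxError("trailing input after the expression")
    return value


tree = parse("(+ (* 3 4 5 6) (- 7 1) )")
```

In this sketch a list becomes a Python list and an atom becomes a str, int, or Symbol, so parse("(1)") yields a one-element list while parse("1") yields the bare integer.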

S-Expressions are a compact way to express programs and data structures. They were first defined for Lisp, but are used in a variety of areas including public key infrastructure. We use S-Expressions to define data flows in Egont, our web orchestration language. In Egont, each S-Expression produces a tree which is converted into a directed acyclic graph, the subject of a future post.

OMeta can be used from C# via the OMeta# project. That makes it more interesting, since classical lexical analyzer and parser generators such as Lex/flex and Yacc/GNU bison do not produce C# code. ANTLR is an interesting alternative, but at the time of this post the latest version, ANTLR 4, does not support C#. OMeta’s ability to deal with ambiguities makes it better suited to experimenting with grammars. However, OMeta carries performance penalties which must be taken into account.

Code

The code is available on GitHub as SExpression.NET.

  1. Compile the RebuildParser project first
  2. Run the Test project
  3. The SExpression project contains the SExpression.ometacs parser and its related C# classes

See Also

  1. Egont, a [Social] Web Orchestration Language
  2. Egont Part II

Additional Resources

  1. IronMeta: another OMeta implementation in C#
  2. YaYAML: a YAML parser written in OMeta#
  3. OMeta Performance
  4. Domain-Specific Languages: An Annotated Bibliography

Egont Part II

(part I here)

Description

Egont is a shared space where users mash up personal information.
Its top goals are:
  • Discovering and curating new information in a personalized and dynamic way.
  • Promoting emergent behavior in a shared programming environment.
  • Facilitating serendipity.

Egont is a personalization environment where users can connect to, import, expose, and index data from their web services. They can also apply functions to build mashups around their personal interests, as in a spreadsheet. On Egont, users can combine and exchange information. For example, users can connect their Egont accounts to services such as movie rankings and merge the rankings from their social networks. If they want to find independent films, they can filter out blockbusters. When people in their social networks update their rankings, the updates are processed and the result is recalculated automatically. The same idea applies to streams from Twitter or blog posts. One user can apply a filter to those streams to curate information apart from mainstream trends and recommendation systems, while other users can build new filters on top of that user’s data. Third parties can take advantage of the data flowing through this shared environment by developing new information functions.

Egont has a simple programming language in which experienced users can access other users’ variable namespaces and set fine-grained security rules to enable or restrict the flow of information. Less experienced users personalize their Egont experience through a simpler web interface.

Summary

Egont is composed of the following elements:
  1. A data flow engine
  2. A data store where cell values are persisted
  3. A web application
  4. A simple programming language

Data Flow Engine

The data flow engine works like a spreadsheet: some cells depend on others, and a value is recalculated only when a cell it depends on changes. For example, one cell may contain a function that retrieves new tweets, while another cell takes those tweets and uses a second function to extract named entities such as places or proper names. In this way users can process, aggregate, and filter the vast flow of information from many sources.
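The spreadsheet-like behavior described above can be sketched in a few lines. This is an illustrative Python model, not Egont's engine: the Sheet class and its methods are hypothetical, the dependency graph is assumed to be acyclic, and extracting hashtags is only a crude stand-in for named-entity extraction.

```python
class Sheet:
    """Toy data flow engine: cells hold values or formulas over other cells."""

    def __init__(self):
        self.values = {}        # cell name -> current value
        self.formulas = {}      # cell name -> (function, list of input cell names)
        self.dependents = {}    # cell name -> set of cells that read it
        self.recalculated = []  # log of recomputed cells, kept for illustration

    def set_value(self, name, value):
        """Store an input value, then recompute only the cells downstream of it."""
        self.values[name] = value
        self._propagate(name)

    def define(self, name, func, inputs):
        """Attach a formula to a cell; the input cells must already have values."""
        self.formulas[name] = (func, list(inputs))
        for source in inputs:
            self.dependents.setdefault(source, set()).add(name)
        self._recompute(name)
        self._propagate(name)

    def _recompute(self, name):
        func, inputs = self.formulas[name]
        self.values[name] = func(*(self.values[i] for i in inputs))
        self.recalculated.append(name)

    def _propagate(self, changed):
        order, seen = [], set()

        def visit(name):
            if name in seen:
                return
            seen.add(name)
            for child in sorted(self.dependents.get(name, ())):
                visit(child)
            order.append(name)

        visit(changed)
        # Reverse post-order is a topological order of the affected cells,
        # so every cell is recomputed after the cells it reads.
        for name in reversed(order):
            if name != changed:
                self._recompute(name)


def hashtags(tweets):
    return [word for tweet in tweets for word in tweet.split() if word.startswith("#")]


sheet = Sheet()
sheet.set_value("tweets", ["lunch in #Lima"])
sheet.set_value("movies", ["Brazil"])
sheet.define("entities", hashtags, ["tweets"])
sheet.define("movie_count", len, ["movies"])

sheet.recalculated.clear()
sheet.set_value("tweets", ["lunch in #Lima", "flying to #Quito"])
```

After the last call, only the entities cell has been recomputed; movie_count is untouched because nothing it depends on changed.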

The key feature of the engine is its ability to apply functions to a set of shared cells from other users. Another important feature is the handling of security settings. Users can configure which cells are shared with which users at a very granular level.

Web Application

The web application has two important parts. One is the editor where advanced users can use the browser to edit their Egont scripts. The other is a simpler user interface where users are able to define their sources of information and apply functions to them more easily.

Programming Language

The goal of Egont is to simplify personalization and the building of mashups, so its programming language is oriented toward quickly orchestrating user information.

This is a rough example of how an advanced user could use the Egont programming language to merge friends’ movie rankings:

friends <- [egont.users.alice, egont.users.bob, me] # list of friends
movies_ranking <- imdb.ranking("swain-4") # persist my ranking from my IMDB user in movies_ranking
movies_average <- average(apply(friends, 'movies_ranking')) # average the movie rankings of the listed friends; it only changes when a ranking is updated
egont.feeds <- movies_average # expose the result as a feed in the web application

Whenever any of the above users modifies a movie’s ranking, Egont recalculates that movie’s score.
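One possible reading of the script above, sketched in Python: apply collects the same variable from each friend's namespace, and average computes a per-movie mean. The namespaces dictionary, the sample scores, and the exact semantics of apply and average are assumptions made for this illustration; the post does not define them.

```python
from collections import defaultdict

# Hypothetical per-user namespaces, each exposing a movies_ranking variable.
namespaces = {
    "alice": {"movies_ranking": {"Brazil": 9, "Alien": 8}},
    "bob": {"movies_ranking": {"Brazil": 7}},
    "me": {"movies_ranking": {"Brazil": 8, "Alien": 6}},
}


def apply(users, variable):
    """Collect the named variable from each user's namespace."""
    return [namespaces[user][variable] for user in users]


def average(rankings):
    """Per-movie mean over the rankings that include that movie."""
    scores = defaultdict(list)
    for ranking in rankings:
        for movie, score in ranking.items():
            scores[movie].append(score)
    return {movie: sum(vals) / len(vals) for movie, vals in scores.items()}


friends = ["alice", "bob", "me"]
movies_average = average(apply(friends, "movies_ranking"))
```

In Egont itself the last assignment would be a cell in the data flow engine, so it would be re-evaluated automatically when a friend's ranking changes rather than being called by hand.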

With Egont, we will have a place where we can discover new resources, research our interests, and create a community capable of sifting through the ever-growing sea of data available on today’s web.

See Also

  1. Parsing S-Expressions in C# using OMeta

Resources

  1. A Brief History of Spreadsheets
  2. Kahn process networks
  3. Directed acyclic graph
  4. Advances in IC-Scheduling Theory: Scheduling Expansive and Reductive Dags and Scheduling Dags via Duality
  5. Pregel: A System for Large-Scale Graph Processing
  6. Grzegorz Malewicz’s Google Research page
  7. CIEL: a universal execution engine for distributed data-flow computing
  8. Bloom Programming Language (via ComingThoughts)

Ideas: Egont, A Web Orchestration Language

Inspiration

Human curiosity goes beyond limited web applications, recommendation systems and search engines. People collect lists of things on the web. Things like music playlists, movie rankings or visited places are populating our web culture, but this information is spread out in different places and we need search engines, social networks, and recommendation systems to leverage it. The real-time web also offers transformation opportunities which are only limited by the imagination.

How can we adjust all this information to our personal or organizational needs? The semantic web could play an important role here, but the web is not organized semantically yet. However, it is possible today to give people tools to manipulate information at a personal and social level. Spreadsheets have hundreds of functions which are used by people with limited computer and mathematical skills. What if we could transform information in a similar way? What if a new stimulus, like a new tweet or a newly ranked movie, could trigger a cascade of processes?

People and organizations are sharing a record amount of data, but current web platforms tightly dictate the limits of its use. For example, Twitter’s API has very low call rate limits for the general public. Most Twitter applications cannot retrieve more than one or two degrees of a user’s social network without working around these API limitations. Examples of such limitations abound, and they undermine the opportunities to make full use of the data.

The inspiration for Egont came from the idea of a social operating system. People do not only share data, they also share data transformations. Egont is a platform for writing simple code snippets while allowing others to reuse them to extract new information. It is a shared pipeline focused on connecting people’s data and processes. It can be thought of as a living operating system: when a state changes, the dependent processes are recalculated. Although Egont has clear security controls, it is primarily oriented to data that can be shared, and it even provides tools for exporting information to be analyzed offline. The shift is from a perspective where users accept the platforms’ applications to one where users generate not only data but also processes. Users and third parties will be free to write new functions to extend Egont’s capabilities.

(continue to part ii)