Posts tagged: opensource

Report from OSCON 2010 – Sessions Day 3

11:52am: Perrin Harkins is going to build on Tim Bunce’s Devel::NYTProf talk by addressing how to actually speed up Perl bottlenecks after you identify them. Wait, no, now he’s blaming the database end of things for most performance problems, so that’s not strictly a Perl bottleneck at all… but in my experience, he’s right… it’s usually your SQL, database connection overhead, or something like that in modern apps. It’s always in the I/O.

Now he’s talking about the overhead incurred by ORMs… object relational mappers, frameworks like Rails or Grails or, in the Perl world, DBIx::Class. This is where I get to grin and feel like I am making a smart choice by writing all my SQL raw (and conveniently ignore any development time efficiency I might gain by switching to an ORM). Since I don’t have the luxury of DBAs to tune indexes and whatnot, I’m guessing I’m still better off staying intimate with MySQL.

Ah, some actual Perl optimizations now:

  • Slurp files when possible (unless too large), don’t read them line by line off the disk.
  • Use a ‘sliding window’ to read large files.
  • Text::CSV_XS is wicked fast… don’t parse those CSVs by hand, if performance matters.
  • LWP: not so speedy, if it matters. LWP::Curl, much faster. Or if you’re hitting a lot of different URLs, HTTP::Async for concurrent connections.
  • Eliminate startup costs with something like mod_perl or FastCGI. Check.
  • Compiling Perl without threads can buy you 15%, if you don’t need threads… but now you need to maintain your own Perl…
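The first bullet is the one I see botched most often; here’s a minimal sketch of the difference in Perl (filename hypothetical, and assuming the file actually fits in memory):

```perl
use strict;
use warnings;

my $file = 'data.txt';    # hypothetical input file

# Line-by-line: a read per line, plus loop overhead on every iteration.
open my $fh, '<', $file or die "Can't open $file: $!";
my $contents = '';
$contents .= $_ while <$fh>;
close $fh;

# Slurp: locally clear $/ (the input record separator) so one
# <> read returns the entire file in a single string.
open my $slurp_fh, '<', $file or die "Can't open $file: $!";
my $slurped = do { local $/; <$slurp_fh> };
close $slurp_fh;
```

The `do { local $/; ... }` idiom scopes the change to `$/` so the rest of the program still reads line by line.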

Interestingly, I only seem to take detailed notes during optimization talks… a sign that this topic interests me too much. ;)

Well, time for the closing keynotes and then lunch/evening festivities. Time flies when you’re being a geek.

11:14am: Enjoying Patrick Michaud’s talk on Rakudo Star… the first ‘usable’ Perl 6… all of which I’ve already been privy to from lurking the Perl 6 blogs etc.… so I’m spending most of this time trying to figure out how to get ‘git’ to use a different transport than SSH… so I can keep our local installation of Rakudo up to date from my latest ‘protect me from myself’ firewalled vantage point on our network. Grrrr.

10:11am: Louis Suarez-Potts, PhD, of Oracle is telling the story of Open Office… another slice of open source pie that came to the company as a result of the Sun acquisition. He makes the point that the architecture of your code shapes community participation in its development; basically, it’s the argument for plugins/APIs/modularity in software design, which is the only way to properly distribute programmer workloads in order to maximize efficiency… you can’t have too many cooks with their hands right in the core of your codebase.

After speaking broadly on several aspects of community forming, he concludes with a somewhat rousing critique of ‘commodity culture’, encouraging the thirty-some-odd attendees not to think of his project, Open Office, as a commodity. I think everybody understands this about all software on some level, but we vary in how we articulate and respond to it.

10:00am: Allison Randall just announced that OSCON would definitely be in Portland again next year, to wild acclaim. It seems that holding the event in San Jose last year was extremely unpopular: everyone loves Portland. Me too.

9:48am: Simon Wardley is talking about management philosophies… a somewhat rare digression from technical topics here at OSCON… but after all, this is one of the keynotes, and you’ve got to keep it light. I’m hearing a handful of familiar words: Agile and Six Sigma, for example, and the relative strengths and weaknesses of each. Simon says that Agile excels at innovation, but sucks at managing predictable processes. Six Sigma, he says, excels and sucks inversely. And no, I’m not putting ‘sucks’ in anyone’s mouth; Simon has an informal speaking style.

Report from OSCON 2010 – Sessions Day 2

Ed, as it turns out, was only the beginning.

After a Hunter S. Thompson-esque jaunt through the Lloyd Center district, culminating, thankfully, in not being robbed at the Motel 6, here I am once more at OSCON for a technically edifying day in the Northwest.

10:39pm: Lunch next to the Expo Hall was better than expected. After, I decided to bring the MacBook Pro back to the room… finally tired of lugging it around and seeing the wisdom of netbooks and iPads. In the afternoon I attended the Perl Lightning Talks, which were a mix of really neat 5 minute demonstrations and silly entertainments… all in good fun.

11:28am: And now for a jQuery UI session, perfect for me. I’ve been wading into the jQuery world to improve my user interfaces for some time now, and am about to really get my hair wet. I subscribed to the official jQuery podcast and have been working through the episodes on my commute.

10:51am: I was hoping to attend License To Fail, but apparently so was everyone else; it filled up fast, and the ushers closed the doors. Instead I’m in Programming Websockets, an interesting enough talk that I *think* is about bringing statefulness to the web, but as the standard is still evolving, it’s far too bleeding edge to get excited about right now.

9:48am: now a guy’s picking on C++ and Java syntax. Hard to argue, but, waiting for his bright idea…

Ah, finally the rub: he’s selling the ‘Go’ language (not the game). Go is meant to be a perfect compromise between compiled (statically typed) and interpreted (dynamic) languages.

9:34am: The reception for the guy from Microsoft sounds like SNL’s Sarcastic Clapping Family. He’s from their interoperability department… which I assume sits adjacent to legal, where the patents get drawn up.

Ever repeat a word so many times, it loses all meaning? Ready, set, go: cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud, cloud…………

9:28am: A woman from SETI is calling us ‘earthlings’. Finally, I belong.

9:08am: a Facebook guy is talking about “HipHop for PHP”, which is PHP compiled down to C++ compiled down to a fast binary, all in the service of… making Facebook pages load faster. And of course, they’ve open sourced it. UNH can consider using this once we earn several ga-jillion page views per day… that is, we don’t need it. Nice to know it’s there though. And I wasn’t aware that “Facebook has been developed from the ground up using open source software.”

[continue to Day 3]

Report from OSCON 2010 – Sessions Day 1

The appetizers of the first 2 days have been eaten, and the meat of the conference is now being served. I’ll be liveblogging the sessions from here on out… most recent thoughts at the top:

8:21pm (PST… naturally): I’ll fess up; the final two session slots today offered little of interest to me. Instead, I retired to the hotel bar, and ended up in extended conversation with Ed Rynerson, an 87-year-old newspaper distributor. I learned more in two hours than I would have at 10 computer programming conferences. Thanks, Ed… I’ll never forget it.

2:30pm: Next up for me: a session on Devel::NYTProf by its author Tim Bunce. This is a module that profiles how much time each section of your Perl code takes to compile and run, to help you find speed bottlenecks. Measure twice, cut once. A couple of tips on optimization, once you’ve determined it’s needed:

  • exit subs as early as you can
  • profile known workloads, don’t worry about tuning for datasets far larger than you’ll ever actually deal with
  • add caching *if appropriate*… easy to introduce bugs here
  • don’t create objects (expensive) that don’t get used
  • rewrite hot-spots in C
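For what it’s worth, invoking Tim’s profiler takes almost no setup (assuming Devel::NYTProf is installed from CPAN; the script name here is hypothetical):

```shell
# Profile a run of the script; writes ./nytprof.out by default
perl -d:NYTProf myscript.pl

# Turn the raw profile into an HTML report (./nytprof/index.html)
nytprofhtml
```

The HTML report breaks time down per statement and per subroutine, which is what makes the “exit subs early” and “cache if appropriate” tips above actionable.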

I love this piece of advice that Tim has repeated at least thrice now: after you make an optimization, re-test, and if it runs fast enough now, put the profiler DOWN, and walk away. Compulsive performance tuning is a common affliction.

The talk is about to conclude, however, and I wish there had been examples of how to profile code running on the web rather than at the command line. Admittedly, I haven’t messed with any of this yet, and it might be dead easy.

1:51pm: Patrick Michaud, Perl 6 implementation superhero, is now showing us some Perl 6 basics. This is becoming an annual treat/tease for me. The upcoming ‘Rakudo Star’ release is not considered production ready, but rather a ‘usable’ release of the new Perl 6 language. Getting closer… I swear it’s true. It’s nice to see these folks taking their sweet time with Perl 6, raising it like a child, rather than hormone-fed beef. It’ll be better for us this way… when we, uh, eat the child. Ok, no more mixing metaphors for me!

11:32am: Eric Day runs a session on Drizzle. The minimalist website echoes the minimalist “lightweight” design… for instance, if you’re not using stored procedures, there is no need to have this feature enabled and impacting performance in your RDBMS. Another interesting tidbit is that Drizzle (a fork, or derivative, of MySQL 6) has completely dumped the MySQL users table for authentication. I like that, I think.

11:01am: Now I’m in a FOSS and EDU session. Here’s an interesting program: the Professors’ Open Source Summer Experience. Think FITSI, but with a focus on open source.

And now something for students… Undergraduate Capstone Open Source Projects. Funny quote from the slide: “The culminating experience of one’s undergraduate experience is an NDA [non-disclosure agreement].”

9:59am: Marten Mickos is up now, formerly CEO of MySQL AB, before the Sun (and ensuing Oracle) acquisition. Now he’s heading up the Eucalyptus cloud computing platform. His comment about combining his passions for open source and for making money got some laughs… but he’s right. There’s far less tension there than people think (especially, I’m imagining, in a cloud…)

But wait: Eucalyptus is available as both an on-premise and an off-premise service. Is on-premise really the ‘cloud’ though? At a certain point we’re just buzzword compliant here… cloud is becoming a term for distributed, load-shared or virtualized systems, whereas it used to just mean “that internet out there…”

9:00am: Well here I am at OSCON 2010, a gathering of the best, brightest and myself (heh) here in Portland, Oregon. The Open Source Conference.

Tim O’Reilly kicks off the keynotes and I am immediately reminded: these are the techno-hippies. Psychedelic visions previously confined to the brain are now pixels on the screen, and this is clearly a gathering of technologists who think in far broader terms than ROI and cost-benefit analysis. Open source is, at the core, a very self-conscious social and political movement, which I wish more people in IT understood, and would get behind.

This is becoming more explicit as the keynotes proceed. Jennifer Pahlka has now taken over to push an effort to bring open source software to municipal governments.

The CTO of the District of Columbia is up now. He leaves platform selection decisions to the technologists themselves… to the people who’ll actually implement his solutions, where the rubber meets the road. He hires experts and trusts them. Get this revolutionary off the stage, he obviously doesn’t know how to manage his subordinates properly.

On a side note: the wireless access here at the Oregon Convention Center, supporting some 3,000 geeks on laptops and handhelds, is fast and flawless. This makes me all the more angry at the hotel, whose wireless access shits the bed ritually, in support of only a couple hundred users. When will hotel wireless reach the same amenity status as running water? I would have gladly traded one for the other at several points this week. Maybe that’s why us geeks sometimes smell a little… suspect (back to the techno-hippie theme…)

[continue to Day 2]

If The White House Can Do It…

… why can’t we?

It was announced yesterday that the White House IT folks have released some of their Drupal modules to the open source community. I am used to thinking of government as a dusty dinosaur whose bureaucracy keeps it behind the times. So this surprised me.

It seems we lack institutional support and direction on contributing to open source here at UNH. Is this true? We certainly use open source like gangbusters; we’ll take all the quality free software we can get. But do we contribute back? Are we feeding the virtuous circle, or simply leeching it?

In the midst of trying to better monetize our intellectual property, who amongst our best minds can articulate the corners where capital isn’t actually king?

And, why do I rant like this? Probably because I am more mad at myself than anything for not having made a meaningful contribution to computing in the public domain.

I just want to be like the White House staff, and do it on the clock. ;)

For Love or Money

“…how much sense does it make to lock an investment into a technology, the first and last thought of whose practitioners is how much they can squeeze you for? How much sense does it make to lock an investment into a technology that is avoided by those who do quality work, not for money, but for its own sake?”

That’s from a neat little post by Jeffrey Kegler called “Ringo Starr and Willy Sutton On Programming Languages”. Sums up my thoughts on why so much commercial software sucks despite the price tag.

More than one way to skin the Database Cat

I’m in the planning stages of a Perl application where I need some sort of database engine back-end with pretty basic requirements. I’ve surprised myself in picking what, today, has become an unconventional choice: DBM. For long-time Unix practitioners, DBM is well known, and there have been many re-implementations of the original idea (NDBM, SDBM, GDBM, etc.). Whatever the iteration, the basic idea is the same: a library of routines that is loaded into the application’s own address space and provides a basic key/value mechanism for storing and retrieving records. See the Wikipedia DBM article for a brief explanation and history of DBM.

Of course many application developers have forgotten all about DBM, assuming that it is an obsolete technology. The assumption today has pretty much become that the database back-end will of course be a relational DB with an SQL query language interface. But between applications that need all the power and features of a full-blown relational database, and those that only need the low-level read/write operations of a general-purpose file system, there is a middle ground that a DBM-like database engine fills very nicely.

As it turns out DBM style databases are not dead at all but are actively being developed. QDBM (Quick DataBase Manager) is just one of several modern DBM-like open source database libraries available for deployment. The QDBM web page graciously lists a short description of some of its “brothers” where author Mikio Hirabayashi writes:

There are many followers of UNIX DBM. Select the best suited one for your products. NDBM is ancient and you should not use it. SDBM is maintained by Apache Project, and GDBM is maintained by GNU Project. They are most popular and time-tested. TDB is maintained by Samba Team. It allows multiple simultaneous writers. While CDB does not support updating at a runtime, it is the fastest. Berkeley DB is very multifunctional and ACID compliant. It is used in many commercial products. Finally, QDBM is balanced of performance, functionality, portability, and usability.

For my own project I’ve decided to give Berkeley DB a try. It appears to be feature rich, well supported, and perhaps most importantly, is already installed on the system where I will be deploying my application. :-)
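Part of the appeal is how little ceremony the tie-based interface requires. A minimal sketch using DB_File, the Berkeley DB binding that ships with Perl (the database filename and keys here are hypothetical):

```perl
use strict;
use warnings;
use Fcntl;      # supplies the O_CREAT and O_RDWR open flags
use DB_File;    # Perl's Berkeley DB binding; exports $DB_HASH

# Tie a hash to an on-disk Berkeley DB hash file; ordinary hash
# operations now read and write the database transparently.
tie my %db, 'DB_File', 'app.db', O_CREAT | O_RDWR, 0644, $DB_HASH
    or die "Cannot open app.db: $!";

$db{'user:42'} = 'jsmith';                       # store a key/value pair
print $db{'user:42'}, "\n" if exists $db{'user:42'};

untie %db;    # flush and close the database
```

Everything past the `tie` is plain Perl hash manipulation, which is exactly the “middle ground” appeal: persistence without dragging in SQL.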

For more information about the DBM approach to database management for applications, and about Berkeley DB in particular, I recommend checking out the first chapter of the Berkeley DB Programmer’s Reference Guide. It’s an interesting read and spells out the case for when a DBM-style database is, and is not, a good fit for an application.

OpenSolaris – The Better OS

So you’re a Linux demigod? Have you seen OpenSolaris? What other OS can you plop on a lowly PC and scale to more than 128 processors running multiple cores? It can run Linux code, virtualize itself or other OSes, replaces the painful init.d and its start/stop scripts, has a kernel tracing system called DTrace that plops probes right into a running kernel, has a far superior kernel threading mechanism than anything else out there, and enjoys the benefits of ZFS, the Zettabyte File System: a file system that can handle mirrored (RAID 10-style) or double-parity (RAID-Z2) layouts, create backup snapshots, and pool cheap disks into simple virtualized storage aggregates (why even go hardware RAID?). And the finale: it is open source!
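If you doubt the “why even go hardware RAID?” quip, pooling disks and snapshotting under ZFS really is this terse (pool, snapshot, and device names below are hypothetical):

```shell
# Create a double-parity (raidz2) pool named "tank" from four plain disks
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0

# Take a named, near-instant snapshot of the pool's root filesystem
zfs snapshot tank@nightly

# List the snapshots we have
zfs list -t snapshot
```

No volume manager, no partitioning step, no RAID controller BIOS: the pool is mounted and usable as soon as `zpool create` returns.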
