Changing the world one byte at a time


My coding work lately has focused on a major upgrade to Planeteria: giving it a web framework (currently it has none).  The framework I chose was Flask, because of its simplicity, its flexibility, and its scalability.  It has a strong community and documentation, which makes our project more friendly to new contributors, and it has several add-ons for features that we hope to implement or improve on down the road, which will make future improvements to the site much easier.  I’ve also done some reorganizing of the files so that there’s clearer separation of the business logic and presentation logic.  Again, this should make the site more friendly to new contributors who might want to edit the backend Python code without unintentionally altering the page content.  The page layout is in Bootstrap, which is another tool that is commonly used and well documented.  It’s often used by developers for prototyping, with some handy defaults for buttons, blocks of text, etc.  Although you may notice that many projects and websites have a very similar look because of this, it provides a foundation that is very easy to alter through CSS, and a skilled designer could do a lot more with it.

This week I created a schema for the data in SQLite3 and took a crash course in SQLAlchemy to work with that data.  Currently, Planeteria stores its data in SQLite with the help of a tiny tool that simply stashes it as if it’s a dictionary; there is no schema.  This has been a huge frustration of mine, as it’s difficult to see exactly what data is stored in what format.  That is not ideal when trying to debug Unicode issues and character escaping.

I now have planet data (name/description/slug) and a planet’s feed data (feed name/url/image/planet id) saving to a database, with the ability for a planet admin to update any of that data correctly.  With just a few SQL commands, I can now check the database and see if my code did what it should with the data.  Huzzah!  This led to a few revisions of the schema, each time simply deleting the database and creating a new one with the updated schema, then more testing.  Last night, I did some rigorous testing, trying my hardest to “break” it (identify flaws).  The biggest test was using Unicode characters and quotes, two things that cause problems right now if they’re entered in the admin page.  It saved and loaded the data exactly as it should!  It’s lovely when things just work as they’re supposed to.
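To make the shape of that data concrete, here is a minimal sketch of a schema along those lines.  It uses Python’s built-in sqlite3 module rather than SQLAlchemy so that it runs with no extra installs, and the table names, column names, and helper functions are my own guesses for illustration, not the actual Planeteria code.  Values are passed as query parameters, which is the kind of approach that lets quotes and Unicode round-trip without any hand escaping.

```python
import sqlite3

# Hypothetical schema: a planet has many feeds (names are assumptions).
SCHEMA = """
CREATE TABLE planet (
    id          INTEGER PRIMARY KEY,
    slug        TEXT NOT NULL UNIQUE,
    name        TEXT NOT NULL,
    description TEXT
);
CREATE TABLE feed (
    id        INTEGER PRIMARY KEY,
    planet_id INTEGER NOT NULL REFERENCES planet(id),
    name      TEXT NOT NULL,
    url       TEXT NOT NULL,
    image     TEXT
);
"""


def create_db(path=":memory:"):
    """Create a fresh database with the schema above."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_planet(conn, slug, name, description=""):
    """Insert a planet and return its id."""
    cur = conn.execute(
        "INSERT INTO planet (slug, name, description) VALUES (?, ?, ?)",
        (slug, name, description),
    )
    conn.commit()
    return cur.lastrowid


def add_feed(conn, planet_id, name, url, image=None):
    """Insert a feed belonging to a planet and return its id."""
    cur = conn.execute(
        "INSERT INTO feed (planet_id, name, url, image) VALUES (?, ?, ?, ?)",
        (planet_id, name, url, image),
    )
    conn.commit()
    return cur.lastrowid


def feeds_for(conn, slug):
    """Return (name, url) for every feed on the planet with this slug."""
    return conn.execute(
        "SELECT feed.name, feed.url FROM feed "
        "JOIN planet ON planet.id = feed.planet_id "
        "WHERE planet.slug = ? ORDER BY feed.id",
        (slug,),
    ).fetchall()
```

With something like this in place, checking what was saved is a single call such as `feeds_for(conn, "wfs")`, or the equivalent SELECT typed into the sqlite3 shell.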

The database does not yet save feed content data.  It’s still to be determined exactly how that will be saved.  I’ve talked before about adding tagging to the site to help people more easily identify content within a planet that is relevant to them.  Since the only users on the site are the planet admins, the planet admin would be responsible for assigning tags to each feed; but how would the admin correctly guess what tags would most interest the planet readers?  I think it just puts an undue burden on the planet admin for maintaining all the tags for their planet feed.  Additionally, admins can edit a blog’s feed data but they have no control over individual blog entries; the topics discussed by a blogger in the planet might vary greatly, making an individual tag that brings up that blogger’s entries only applicable to some, not all of those entries.

I’m not sure that adding individual user logins for readers to save personally selected tags is the right solution.  It seems overly complicated and potentially underused.  I’d want to wait to see if there is really demand for this; as it is, many readers are simply reading the feed in their own feed readers anyway.  A much simpler solution that would enable readers to find content most relevant to their interests is a Search window.

I bring all of this up because how the feed content data is used/accessed will guide the decision of how it’s saved.   A relational database would be an appropriate way to store feed tags, but if we want users to be able to search the content, storing it in a server designed for searching such as Solr or Elasticsearch might be more appropriate.
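To illustrate the difference between the two approaches, here is a small sketch in plain Python.  Everything in it (the Feed and Entry classes, the function names) is hypothetical and only for illustration, and the search is a naive substring match standing in for what a real search server would do far better.  The point it shows is that tags live on the feed, so filtering by tag returns every entry from a tagged blog, on-topic or not, while search looks at each entry on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Feed:
    name: str
    tags: set = field(default_factory=set)  # assigned by the planet admin


@dataclass
class Entry:
    feed_name: str
    title: str
    body: str


def entries_by_tag(feeds, entries, tag):
    """Feed-level filter: every entry from any feed carrying the tag."""
    tagged = {f.name for f in feeds if tag in f.tags}
    return [e for e in entries if e.feed_name in tagged]


def search_entries(entries, query):
    """Entry-level filter: naive case-insensitive substring search."""
    needle = query.lower()
    return [
        e for e in entries
        if needle in e.title.lower() or needle in e.body.lower()
    ]
```

If a blogger tagged “python” writes one post about Flask and one about their lunch, `entries_by_tag` returns both, while `search_entries(entries, "flask")` returns only the first.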

While I mull that over, there are plenty of other things to do with the Admin page functionality I’ve built so far: next on my plate is form validation, security, and error reporting.  Then, adding user logins, most likely with Mozilla Persona.

WFS Planet update:  Today, I updated a few feeds in the Women in Free Software admin page and was reminded again of a bug that makes feed images disappear or assigns them to a different person’s feed.  I’ve put those images back where they should go for now, but it may happen again.  It pains me to leave the live site in its broken state, but I’ve decided to focus my efforts on the Flask implementation instead and create a stronger foundation so that bugs like this can more easily be avoided in the future.  I’ve made sure that the JavaScript on the new admin page assigns unique IDs to each feed’s data so that this can’t happen on the new Admin page once the Flask site is launched.  I can’t wait to see it in action!


Subtitle: I fixed it!

The bug that had prevented me from adding any more feeds to WFS is now fixed!  It was surprisingly easy; it had to do with an unescaped apostrophe throwing a wrench in the JavaScript in the admin page.  The tricky thing that made it hard to diagnose is that it only threw an error when you tried to save a new feed AFTER you had added and successfully saved the feed with the apostrophe in the Name field.  Normally, if I run across a bug in the Admin page (of which there are many), I get an error as soon as I try to submit a newly added feed that it doesn’t like.

I finally found the source when I noticed in Chrome Developer Tools that the JavaScript was throwing an error right at the line that had a name with an apostrophe; I then looked at the other two planets that I know are affected and sure enough, it showed the same error at the feeds that had an apostrophe in the name field.  Because single quotes were used around the string value, the apostrophe was treated as the end of the string, when clearly it wasn’t supposed to end there.
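For anyone curious what this kind of bug looks like, here is a rough sketch in Python.  It is not the actual Planeteria fix, and both function names are made up for illustration.  The first function wraps the value in single quotes by hand, the way the broken page effectively did; the second lets json.dumps produce the string literal, which takes care of quotes and backslashes, and also escapes angle brackets so the value can’t close a script tag early.

```python
import json


def naive_js_string_literal(value):
    """The buggy approach: wrap the raw value in single quotes."""
    return "'" + value + "'"


def js_string_literal(value):
    """Safer approach: let json.dumps build the JavaScript string literal.

    JSON string syntax is valid JavaScript, so quotes and backslashes
    come out escaped.  Angle brackets are additionally replaced with
    \\uXXXX escapes so the text is safe inside a <script> block.
    """
    literal = json.dumps(value)
    return literal.replace("<", "\\u003c").replace(">", "\\u003e")
```

With a feed name like `Zoe's blog`, the naive version emits `'Zoe's blog'`, and JavaScript ends the string right after `Zoe`.  The json.dumps version emits a double-quoted literal, so the apostrophe is just another character.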

Now that the bug is taken care of, I’ve added the remaining intern feeds for this OPW round, so you can now read all the OPW intern blogs on the Women in Free Software planet.  I added a few Google Summer of Code students as well; if there are any other female GSoC students who would like to be added, just let me know!

With these additions, the number of feeds on Women in Free Software planet is now approaching 100, which feels pretty big to me.  There are some folks who don’t always want to read all the blogs in the feed; for example, you might want to only read blogs of people who are working with the same coding language, or working with the same organization.  If this is you, I have a request and a suggestion:

(a) in case you’re not aware, Planeteria enables anyone to make their own planet.  You can make a planet with a much narrower scope if you like and read just those blogs that you want.  Just go to the homepage, fill out the two-field form, and start adding your feeds!  It’s that simple.  This is probably the best approach in the short term; however….

(b) there is a new feature coming down the pipeline that will enable you to sort the feed by tags.  Tell me what category/ies you would like to be able to read within the feed, so that when the time comes to implement it, it will have tags that are useful to the planet’s readers.  Leave a note in the comments!


(Update: I’ve made a tl;dr version of this entry)

The first few weeks of my internship have been busy, yet progress on the project has been slower than I had hoped.  As planeteria.org is just a small project (I’m contributor #2), there are many things about Planeteria that need work, and most of it falls on my shoulders.  I’ve been working on finding and reporting bugs, a redesign, a user survey (you’ll hear more about this soon!), and documentation.  Each of these tasks has required a lot of yak shaving, since I am learning how to do this as I go.

Today I’d like to focus on one issue that I’ve been grappling with as I get my toes wet in this project: planet curation.

Before I dive into curation, however, I want to provide some background. Being somewhat new to the open source scene, planets are a new concept to me.  I soon learned that they’re commonly used in the open source world.  As a feed of blog entries by people working on a specific project, planets help you keep your finger on the pulse of what’s going on with the project.

Normally, when a group decides it wants a planet, they will set it up on their project’s website, often using code from either Venus or PlanetPlanet.  Because it’s being used for software development projects, they have the technical knowledge and access to a server, domain, etc to do this.  Planeteria, on the other hand, allows anyone to create a planet with a few clicks, and hosts the planet right there on Planeteria.org.  This makes planets accessible to a much wider audience, for purposes beyond software development and even beyond the FOSS community entirely.  As I work on improvements to serve a broader community, it’s important to keep in mind its roots in open source software development, what makes planets useful in that context, and how that’s applicable (or not applicable) in other use cases.

All of the OPW intern blogs are fed to the Women in Free Software (WFS) planet, so most of us are reading the planet to see what the other interns are working on.  I’ve taken on responsibility for administering the WFS planet and quickly learned why it’s called curation and not just administration.  There is a lot of work and thoughtfulness required to make a planet useful to its readers.

Curators need to keep an eye on the balance between posts that are directly related to the planet’s project or topic, and posts that are personal in nature or otherwise off-topic.  One of the benefits of a planet in the context of FOSS projects (and projects in general) is community-building: it can help readers get to know other project contributors on a personal level, as well as learn what they’re working on and thinking about regarding the project.  This assumes, however, that the planet feed mostly talks about the project or topic of the planet, with a sprinkling of personal posts thrown in.  You wouldn’t want a project’s planet to be flooded with entries about what the bloggers ate for lunch, celebrity gossip, personal health issues, and baby photos.  If all the readers see in the feed is personal posts, then the feed is not very useful to them for learning what’s going on with the project.  Readers may stop using the planet entirely if this is an ongoing issue, and then the planet is no longer useful.

Maintaining this balance is tricky.  The planet curator can’t control the content of a blog.  If the blogger is making use of tags or categories, there is a way to grab just the feed of one of their tags or categories.  But if the blog doesn’t have a tag relevant to the planet’s topic, or assigns the tag that is being pulled into the feed to their personal entries, then the baby photos and celebrity gossip will still appear in the planet’s feed.  And this requires some effort on the curator’s part to determine which blogs are spamming the feed with irrelevant posts, determine if there’s a category or tag they can use instead, and update the feed accordingly.

I’ve already had a few conversations with interns and mentors about ways to make the WFS feed more useful to them.  Several people have requested the ability to view only the blog entries by OPW interns, or at least a way to easily determine which blogs are intern blogs.  This raises the question of whether to create a separate planet just for the internship program, or whether there’s value in continuing to use the WFS planet and creating other mechanisms for readers to determine which blogs belong to the interns.  As a short-term solution, I’ve set it up to display the OPW internship logo next to each intern’s entry.

One feature that could be helpful in this matter is tag-based organization.  Each blog in the planet could be assigned tags that identify common topics the blogger discusses, or what organizations the blogger is affiliated with.  Then the readers could choose to display only the entries by bloggers that have a specific tag.  However this still doesn’t guarantee that the specific entries will be related to those tags, as only the blogger controls what content appears in their feed.  Taking a step back, though, I wonder if tagging is simply a band-aid for a poorly curated planet.  This feature could result in readers missing out on the bigger picture, which is one of the benefits of a planet.  If readers always filter their feed to display only a small number of blogs, they might miss all the buzz about something happening in the broader community.  For example, I learned that many of the people I’m working with knew Aaron Swartz personally, and it’s helpful to know that they’re grieving as I work with them.

In the context of a large open-source project’s planet, most or all of the planet’s bloggers are also reading the planet, and will notice if they’re spamming the feed (or be nudged by a cohort) and self-correct.  This self-correcting mechanism works well if there is an established community that communicates in other channels.  But what about a planet that is not focused on a project that has its own IRC channel (or several), but instead is focused on a topic of interest?  The WFS planet is somewhat of a grey area in this respect.  It includes the OPW program participants, who have an IRC channel and email list by which we can talk to each other, but it also includes blogs of other women in the FOSS community who (I’m pretty sure) are not aware that their blogs are included in this feed.  There are some pretty big names, which makes for great reading, but because they’re leaders in the field, I’m not sure that they would have the time to read the feed even if they knew their blog was part of it.  There are also several organizational blog feeds and a couple of blogs with numerous authors (such as Geek Feminism), which means the authors are several steps removed from this planet, and are thinking of a much broader audience than just the WFS planet readers.  For those blogs, the self-correcting mechanism is lost, which makes more work for the curator.

Here’s where I would appreciate some feedback from the WFS planet readers (hello!):

Do you feel there are any blogs in the WFS planet that are consistently off-topic that you feel should be removed?  Any voices or general themes you feel are missing or underrepresented?

Would you like to have the ability to read only the OPW intern blogs at times, and read the whole feed at other times?  Or would a separate OPW internship program planet be more useful?  If we add tagging, what other tags would you like to use to sort the WFS planet feed?

I’ve discussed Planeteria with a few people over IRC in the OPW channel.  However, not all of the WFS planet readers are OPW participants, so please respond in the comments here instead.  If there’s enough interest, I could set up a WFS planet IRC channel to provide a way of communicating with the other readers and curators about the WFS planet.

I look forward to hearing your thoughts!