Personal categorization

Stimulated by L. Efimova’s thoughts about personal categories and D. Pollard’s report about his getting-things-done experience, I decided to write down what my personal categorization strategy looks like.


My requirements are less about supporting current projects and more about a sort of long-range “knowledge” base.
(In the university, we seldom have projects with deadlines and dedicated budgets available. Rather, it is an ongoing effort to always have future-proof advice readily available, in case sudden funding comes down like rain in the desert and then again stays away for years.)

Gaining an overview is more important to me than quick known-item search and retrieval. Frankly, it sometimes takes me quite some time to retrieve a note or a paper. But I usually don’t regret this time, because along the search path I often remember interrelationships that had slipped from awareness, and by the time I have found the item again, I have not only reopened several folders but also regained my orientation in that particular area.

My category system need not be understood or shared by others (although this may change when we adopt more intranet KM tools).

All in all, my needs are more just-in-case than just-in-time.

My preference in the flat vs. deep choice is decidedly against the flat structure. I hate to search my mind for search words, and I am rather unskilful at that, too. See below how I make my deep structure bearable.

What filing strategy would suit all the above needs best, by date, by origin, or by subject? I think it’s by subject. With one exception: email.
(Since 1984 I have been filing my mail chronologically, initially because everything was collected in a single LOG.MISC or ALL NOTEBOOK on our mainframe, then because it was not easy to add outgoing mail to the incoming mail of the same originator/recipient (and I hated to accomplish this by simply appending all previous mails), and today because emails are still not individually linkable as separate entities in the filesystem.)

The main obstacle for filing strategies is, in my experience, the fact that it is now so easy, too easy, to save files (storage and transfer times are no longer a problem). So the mere decision whether something is worth saving has become more difficult. (Saving too much has many negative effects: increased searching time, more garbage hits, and increased backup time.) I have not solved this problem yet; I still tend to save too much, and yes, I save and categorize things that I never ever reuse. (After all, the mere size of the respective subfolders will later remind me of how important a particular area seemed to be.) The worst sort of saved files are the ones that were collected only halfheartedly, when I already doubted whether they would ever justify their filing. And among these, the paper-based files of ever-decreasing importance are the worst: cross-linking them reasonably with the electronic indexes is especially cumbersome (and almost impossible when rearranging categories). But these paper documents are simply there, and it’s difficult for me to throw them away.

Category tree design

1. The first major choice is about nature and essence vs. relevance and density, i. e. whether

  • to classify by deep-grounded characteristics and features, which most probably leads to a lopsided tree with branches of very unequal size, or
  • to be more pragmatic and stock-oriented, such that higher-level categories are similarly sized (at least initially).

Examples of the former approach are, IMO,

  • the Propaedia of the Encyclopaedia Britannica, with categories like: “The nature and development of technology” > “Technology: its scope and history” | “The organization of Human Work”
  • Linné’s biological kingdoms,
  • Roget’s Thesaurus,
  • von Wartburg’s thesaurus: “A. L’univers” | “B. L’homme” | “C. L’homme et l’univers”


Examples of the latter approach are more popular within delimited scope areas, but also include universal systems like

  • Dornseiff’s thesaurus (20 top categories, including some for more general words such as “3. Space, Position, Form”, “5. Being, Relationship, Event”, “9. Will and Act”, but OTOH including 17 more practical categories such as “17. Devices & Techniques” and “18. Economy”),
  • Dewey’s library classification (approximately balancing the 10 by 10 by 10 topmost categories).

Often, classification systems under debate already differ at the top levels, a difference that might be overcome by simply mapping them onto each other. But if the principal design choice is not agreed upon, the underlying attitude of categorizing in the former or the latter style will tend to differ all the way down to the lower levels. That is why debates about the top levels are so often unnecessary yet tough and frustrating. My choice is the latter, more pragmatic system. It also contains some more general categories for the things I do not yet sufficiently understand, and it’s surprising how, sometimes, later developments justify this cautious approach.


2. The second design criterion is how to cope with change. How often do I have to rearrange the categories? Not very often.

  • Smaller or more general areas grow or differentiate into large subtrees. But the few currently relevant ones can still be linked to the top using shortcuts, rather like a fish-eye perspective gliding through the overall tree.


  • Often, new offspring branches do not grow from the ideal superior folder, but somewhere where an improper subject seemed to be related, perhaps because the new stuff was simply connected to the parent, much like a paper clip clamps an otherwise isolated attachment. The provenance principle (“by origin”) is often stronger than the “by subject” principle. Just leave these new twigs and branches where they grew. Using a “see-also” cross reference, one can still link them into the proper hierarchy, and using desktop shortcuts, one can honour their relevance.
  • Very often, new areas would ideally be placed on bridges between two older areas, because interdisciplinary fields are often the most productive ones. Rather than creating an independent new category for such an area, let it grow out of one of its parent categories while cross-linking it to the other. Whenever I have tried to distinguish such an intersection set as a subset from its superset in terms of “more general” vs. “more specific”, this approach soon failed because it was not robust and clear enough. It’s better to categorize the intersection in terms of “rather … than …”, probably by origin, and to cross-link it. (Neighboring interdisciplinary fields can be pictured as houses in a narrow romantic Italian lane, where two vis-a-vis balconies are much closer than the entrance doors, with a laundry rope connecting them. The laundry probably belongs to one of the two inhabitants, but that does not affect the picture because they are linked.)
  • After a while, the labels may not make much sense any more because, for instance, their constituent hype words may have changed. I am using rather cryptic abbreviations for my folders anyway, which only need to allow some association during the initial engraining phase. After a while, the labels just lead an independent life of their own.
  • Sometimes, two historically differing folders later turn out to contain similar items. If they can be cross-linked using “see also”, that’s no problem. Often they were created during different phases or waves, along with other new phenomena (hypes are often encountered in groups or even as twins). This slight “by origin” flavor within the “by subject” alignment may be a welcome mitigation of taxonomic rigor, as long as the similar folders can be cross-linked.
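
The “leave it where it grew, but cross-link it” rule can be pictured as a small data structure. The following is only a minimal sketch in Python, not a description of any actual tool: the class name `Folder`, its methods, and the example folder names are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare folders by identity, since cross-links form cycles
class Folder:
    """One category; `see_also` holds cross-links to folders filed elsewhere."""
    name: str
    children: dict = field(default_factory=dict)
    see_also: list = field(default_factory=list, repr=False)

    def add(self, name):
        """Create (or return) a subfolder; the twig stays where it first grew."""
        return self.children.setdefault(name, Folder(name))

    def link(self, other):
        """Add a mutual "see also" reference without moving either folder."""
        if other not in self.see_also:
            self.see_also.append(other)
        if self not in other.see_also:
            other.see_also.append(self)

    def reachable(self):
        """Names found one step away, via subfolders or cross-links."""
        return sorted(set(self.children) | {f.name for f in self.see_also})


# Hypothetical example: "weblogs" grew out of "elearning" (by origin),
# but by subject it is also related to "km", so the two are cross-linked.
root = Folder("root")
km = root.add("km")
elearning = root.add("elearning")
weblogs = elearning.add("weblogs")
km.link(weblogs)
```

In this sketch the interdisciplinary folder keeps a single home under its parent of origin, and the cross-link plays the role of the laundry rope between the two balconies.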


Once in a while, however, parts of the category system need an overhaul. I usually rearrange approximately one quarter of the major folders between Christmas and New Year. This is where visualization comes in. (Unlike H. Haller, whom I quoted in a recent blog entry (#46), I do not believe that the entire filing system or sitemap can be visualized, simply because of its size.) For rearrangement, however, it is ideal to see all the historical and interdisciplinary interrelationships on a map. If a tree or subtree structure emerges, that’s fine. If not, the relationships in one’s head are probably too new or weak, and the corresponding folders may remain for another while in a network and on the same hierarchical level.

Concrete steps

In practice, the procedures are as follows.

  • The core of the filing strategy and the first stage in the lifecycle of nearly every file is the desktop (in my case, Windows). Here, new items lie as reminders/to-dos until they are done, after which I move them
    • either into the recycle bin,
    • or into the folder hierarchy.
  • Sometimes a branch of the folder hierarchy starts as a new folder on the desktop, comprising a few items (much like a paper clip).
  • When the project associated with this new folder grows, I move it into a folder farther down the cabinet hierarchy, but immediately afterwards I drag it back in the reverse direction with the right mouse button pressed, creating a shortcut on the desktop that leads to the new, currently relevant, folder.
  • As relevance shifts, shortcuts are added and deleted.
  • If assigning a superior folder is not quite straightforward, I include a link from the chosen folder to the almost-chosen folder.
  • Similarly, if it takes me too long to find a subfolder because I looked in the wrong places, I add a link from where I searched in vain to where I finally found it.

Obviously, this extensive shortcutting causes major effort when target folders are moved. Sometimes Windows finds the relocated folders, but not in the case of nested moves, so I don’t count on this. I hope someday there will be features to make this easier.
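
As a rough illustration of the steps above and of the broken-shortcut problem, here is a minimal Python sketch. It rests on a loud assumption: symbolic links stand in for Windows shortcuts (real .lnk files work differently, and creating symlinks on Windows may require special privileges). The function names `file_away` and `dangling_shortcuts` are hypothetical.

```python
import os
import shutil
from pathlib import Path


def file_away(folder: Path, cabinet: Path, desktop: Path) -> Path:
    """Move `folder` into `cabinet` and leave a shortcut to it on `desktop`."""
    target = (cabinet / folder.name).resolve()
    shutil.move(str(folder), str(target))
    # Assumption: a symlink plays the role of the desktop shortcut.
    os.symlink(target, desktop / folder.name, target_is_directory=True)
    return target


def dangling_shortcuts(root: Path) -> list:
    """List symlinks under `root` whose target no longer exists."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):  # does not follow links
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink() and not path.exists():
                broken.append(path)
    return sorted(broken)
```

Calling `dangling_shortcuts` on the desktop folder after a rearrangement would list the shortcuts that need to be re-pointed by hand; the sketch deliberately does not try to guess where a moved folder went.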

Of course, I also have some special folders, e. g. a flat folder for “thick” files, or special kinds like files without content but just a file name serving as a reminder, or folders with special icons and without content except similar subfolders whose names reflect paper folders in a real-world rack, or notes files quickly jotted down in Notepad with very extemporised filenames, e. g. simply called merk.txt (remember.txt), or special (flat) favorite folders called “shortcuts”, “archive” and “treasure box”.


Without shortcut links, I would be lost. Using them allows me to rely on fairly deep and fairly long-term filing categories.

This entry was posted in Knowledge management.
