Mesopotamia
The Land of the Four River Banks

Data

Version 1
Giorgio Buccellati – March 2021



Tabularity
The epistemology of tabular databases
Categorization
Queries


Tabularity

     A database is generally understood to be tabular in nature: the cells are juxtaposed within a matrix that spells out the parameters for each row and column, according to categorization systems that can be very complex. A database as such does not develop an argument, since cells do not explicitly refer to each other and thus do not develop a sequential and inferential thread that leads to any possible conclusion. In its simplest format, a tabular database can be schematically rendered as follows:


Fig. 1

If we look at cell 4b, the content (X) presupposes contents in the adjacent cells (Y and Z), but does not in and of itself lead to them. Thus there is an implied argument (the categorization system that relates X to Y and Z), but there is no explicit thread that links X, Y and Z, much less one that draws any conclusions from them.
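As a rough illustration of this point, the following Python sketch models a tabular database as a grid addressed by row number and column letter. Only the fact that cell 4b holds X comes from the text; the grid size, the placement of Y and Z in cells 4a and 4c, and the helper name `neighbors` are assumptions made for the example.

```python
# Hypothetical 5-row, 3-column grid; only row 4 is filled in.
COLUMNS = ["a", "b", "c"]
ROWS = [1, 2, 3, 4, 5]

table = {(row, col): None for row in ROWS for col in COLUMNS}
table[(4, "a")] = "Y"  # assumed position of Y
table[(4, "b")] = "X"  # cell 4b, as in the text
table[(4, "c")] = "Z"  # assumed position of Z


def neighbors(row, col):
    """Return the contents of the cells left and right of (row, col)."""
    i = COLUMNS.index(col)
    adjacent = [COLUMNS[j] for j in (i - 1, i + 1) if 0 <= j < len(COLUMNS)]
    return [table[(row, c)] for c in adjacent]


# The cell stores only its own content: nothing in "X" points to Y or Z.
print(table[(4, "b")])
# The relation is recovered only through the coordinates of the grid,
# that is, through the structure imposed by the categorization.
print(neighbors(4, "b"))
```

The value in a cell carries no reference to its neighbors; the link between X, Y and Z exists only in the addressing scheme that surrounds them.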
     The 4banks websites do not have tabular databases (except in a limited way in the indices), but the RECORD section (Part II) of each website is in effect the conceptual equivalent of a (tabular) database. The information is presented without a narrative thread that links the various items.
     On the other hand, each item in the database has explicit links to various other narrative layers (the arrows to the outside in Fig. 1) and within the items themselves (arrow between X and Z). This gives a special dynamic dimension to our databases, one that helps actualize the potential of the digital medium.


The epistemology of tabular databases

     A database exhibits connectivity among cells and sequentiality among rows or columns. A cell is linked to another through shared attributes that are clearly defined and serve as pegs for the linking. A row is parallel to another row, and a column is parallel to another column, and they can be compared because there is a specified set of parameters that defines the individual cells.
     What is fundamental in this regard is the categorization system which governs the definition of the pertinent parameters. The query power of a database, which sets in motion logical thought, is only correlative to the quality of the underlying categorization. The effort that goes into establishing a proper grammar for the data is very much in the nature of an argument, and often a very sophisticated one at that.
     However, tabular databases are argument driven only in a passive sense: the argument is all in the categorization system that is upstream of the database itself, and the relational dimension among cells and rows and columns is only potential. The argument power of the categorization system lies in the ability to extract all the meaningful attributes of the data in question, and to articulate them "grammatically," i.e., in a way that is systemically well coordinated and explicitly structured.
     The relational dimension has to be activated through a query function that spells out the parameters for the search, and the query originates in a human argument brought to bear on the database from the outside. This query is also part of the argument, as it aligns itself with what the categorization system has specified in the first place.
     In spite of their correlation, an entry in any given cell is not explained by the contents of the neighboring cells. They are related in function of the categorization system which operates upstream of the database, but they are not interacting with each other. What activates their relationship is a query, which is in function of an argument brought to bear from the outside. This (human) argument is the result of (dynamic) digital thought applied to the (static) data.
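The division of labor described in this section can be sketched with Python's standard `sqlite3` module. The table name `items`, its columns, the sample rows, and the helper `query` are all invented for illustration and do not come from the 4banks websites; the sketch only shows where the categorization, the static data, and the external query each sit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Categorization: fixed upstream, before any data are entered.
conn.execute(
    "CREATE TABLE items ("
    "id TEXT PRIMARY KEY, kind TEXT, material TEXT, stratum INTEGER)"
)

# Static data (hypothetical): the rows sit side by side without
# referring to one another.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("i1", "vessel", "ceramic", 3),
        ("i2", "seal", "stone", 3),
        ("i3", "vessel", "metal", 5),
        ("i4", "blade", "metal", 5),
    ],
)


def query(kind, stratum):
    """An argument brought from outside: which items share these attributes?"""
    rows = conn.execute(
        "SELECT id FROM items WHERE kind = ? AND stratum = ? ORDER BY id",
        (kind, stratum),
    )
    return [row[0] for row in rows]


print(query("vessel", 3))
```

The query can only ask in terms of `kind`, `material` and `stratum`: its reach is bounded by what the categorization specified in the first place.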


Categorization

     Database management programs make it feasible to create very complex arrays, ideally suited for sorts, queries, and a variety of statistical inferences that can be drawn from them. They are in universal use, and have introduced a new way to think about data. In this respect they have achieved, we may say, an independent epistemological status: they shape the way we organize the "given" (the data) by adding another layer of "givenness," namely the structure within which the categorization has to take place.
     Categorization is what drives the conceptualization of the data by providing a hidden argument that governs the structuring of the material to be analyzed. It establishes rosters of attributes that are organized according to complex hierarchical structures, and in this lies the epistemological dimension of a database: it is not a simple container with items that are roughly arranged; it is rather a highly articulate and internally differentiated conceptual framework.
     Categorization is thus an implicit but powerful argument because it defines preset parameters reflecting a structural understanding of the universe to which the data are expected to belong. Categorization is upstream of the data, because it is formulated before the data are coded, and it reflects both the predetermined structure of the system and the general perception of the data.
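A minimal Python sketch may help show what it means for categorization to be upstream of the data. The roster `Material`, the `Record` type, and the function `code_record` are hypothetical names chosen for the example; the point is only that the roster of attributes exists before any datum is coded, and that a datum can enter only in the terms the roster allows.

```python
from dataclasses import dataclass
from enum import Enum


class Material(Enum):
    """Preset roster of attributes, defined before any data are coded."""
    CERAMIC = "ceramic"
    STONE = "stone"
    METAL = "metal"


@dataclass(frozen=True)
class Record:
    label: str
    material: Material


def code_record(label, material):
    """Code one observation; raises ValueError if no category fits."""
    return Record(label, Material(material))


accepted = code_record("item 1", "stone")

try:
    code_record("item 2", "glass")  # "glass" is not in the roster
    rejected = False
except ValueError:
    rejected = True

print(accepted.material.name, rejected)
```

An observation that falls outside the preset parameters cannot be recorded at all until the categorization itself is revised, which is one way of seeing the implicit argument at work.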


Queries

     While categorization provides an implicit argument upstream of the database, queries rest on arguments that are formulated downstream of the database, i.e., after the database has been completed and is open for use. In that case, since categorization is a known given, and since we use it as a starting point, an argument that is formulated outside of the database can draw on the data in a variety of combinations.
     There is a correlation between categorization and queries (i.e., the argument upstream and the argument downstream), because the power of the query is related to the power of the categorization. Categories are the nerve points that lead to clustering according to shared attributes, and can in this regard answer questions posed by the argument external to the database, or at least they can provide material for the argument to identify an answer and proceed further with more questions.
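Clustering according to shared attributes can be sketched in a few lines of Python. The records and the function `cluster` are hypothetical and serve only to show that the same data yield different groupings depending on which category the external argument chooses to ask about.

```python
from collections import defaultdict

# Hypothetical records, already coded according to two categories.
records = [
    {"id": "i1", "kind": "vessel", "material": "ceramic"},
    {"id": "i2", "kind": "seal", "material": "stone"},
    {"id": "i3", "kind": "vessel", "material": "metal"},
    {"id": "i4", "kind": "blade", "material": "metal"},
]


def cluster(rows, attribute):
    """Group record ids by the value they share for one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row["id"])
    return dict(groups)


by_kind = cluster(records, "kind")
by_material = cluster(records, "material")
print(by_kind)
print(by_material)
```

Each grouping is an answer, or at least material toward an answer, and it is available only because the attribute was defined as a category beforehand.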
     In this lies the powerful epistemological value of databases: they organize knowledge by formalizing data and thus opening them to dynamic access. By way of comparison, we may think of a printed dictionary: here, too, data are organized according to a well-defined categorization system, but the categories and the corresponding attributes are limited and the access is not dynamic, as it is with a digital database.
     Even so, digital databases are passive in the sense that they do not, in and of themselves, develop an argument. Relations are indeed made explicit by virtue of the attributes that are shared (hence the notion of a "relational" database), but they are not active on the basis of an explicit thread that links them in order to arrive at a conclusion. The argument is either pre-existing (categorization) or is brought to bear on the data from the outside (queries). The data information syndrome arises in this context: the data acquire an existence of their own, independent of the argument.
