. /../An idea: Characterization.../ 123
written by Armok on Jul 17, 2009 14:19
The Characterization Markup Language:
"History" of the idea:
For quite some time, maybe about a year, I have had this idea about a markup language to describe characters, real or fictional, with greater precision and less ambiguity than can (easily) be achieved with natural language, as well as being more compact and encouraging going into greater detail. I have made attempts at formulating such a language for use in describing the characters of a minor novel that never got off the ground, as well as some other similar things, but my inexperience in language construction meant the attempts did not result in anything useful for communication.

I let the idea rest for a while, occupied with things not related to writing fiction involving complex characters, and waiting to discover a relevant medium to present the idea in.

Some time ago, I thought I had found such a medium: the forum of a popular webcomic supposedly frequented by both authors and engineers. It turned out there were no engineers, and that authors have at best a vague idea of what a markup language even is. The topic quickly derailed and died.

Thus I post here, where I think there might be people geeky and missionary enough to grasp this concept, and also because I want more involvement with this community.

What is this thing?
Mostly what it sounds like really: a markup language for describing a character.

By a character I mean anything that could be described as having some kind of mind, goal-oriented complex behaviour, or personality. A character may be real or fictional: you are a character, your dog is a character, your roleplaying character is a character, an AI (both the kind you kill in a game and the sapient sci-fi kind) is a character, and if there is a god, he/she/it is a character too. The focus is mostly on minds with a similar level of complexity to a human, but it should be possible to use it to describe all of the above.

Like all languages, it should be highly flexible within its area. One should be able to author a 300-page code about some person, of such complexity that someone who reads it will know them better than they would by living with them for years, or think up a few tags that sum up a two-dimensional disposable NPC in a single line, or anything in between those extremes.

The CML will almost certainly be in plaintext format, with one or more characters described in a .txt file, and structured into various kinds of tags that contain all the information about the character and might refer to one another in some manner. Other than that, not much is decided. Will I use <> or [] for tags? Will indentation have a meaning beyond readability? Should it be mostly self-contained or highly interconnected? What tags will there be and what syntax will they use? None of these things are decided, because I do not know what the best solution is to achieve the goals, and I need YOUR help and advice to determine that.

There are however several core concepts and important tools that almost certainly will be represented and heavily used. These include but are not limited to:
  • URL references to web resources, especially ones such as TVtropes (very useful for describing complex but commonly used character traits; it's almost as if the site was made for CML) or Wikipedia (I can imagine a convention of referring to Wikipedia for almost any concept outside the ones explicitly defined as part of the language).
  • Tags for formal psychological concepts, such as personality types and values. It makes sense to use the systems that already exist for numerically describing personalities, and when trying to describe minds it also makes sense to use the concepts from the science that deals with minds.
  • Many different tags specific to the language, describing among other things preferences, goals, moral principles, beliefs/ideologies, etc. (Note that these most likely would describe the ETHICS of the character, not the morals; how good the character is at following its ethics would hopefully be better described with some more concrete psychological measurement like those mentioned in the above list item.)
  • As most systems for character description use the format of "what would the character do in X situation", there will probably be some kind of tag formalizing this notion. It is also good at handling exceptional situations.
  • And almost certainly several important ones I have not thought of.
    Goals of the project:
    Simple/already achieved in some other system:
  • Describing the gross physical characteristics of a character. (Achieved in innumerable systems. Every RPG with stats and a drawing of a character/monster has this, and as the CML is for describing MINDS, the physical description need only go as far as it is part of the character's core identity, or can be skipped entirely without much loss. The main important things to specify here are really only age, gender, and possibly species/race.)
  • A basic numeric description of the character's personality. (Several systems for this exist within psychology. There exist some non-clinical systems as well, such as how the game Dwarf Fortress describes character personality.)
  • Capability of describing a character well enough to give a fairly good image of what kind of person he or she is, at least as well as a short description in English. (I can't think of any system that really does this for all aspects except natural languages, but there are several systems that come fairly close, such as "pile of links to TV tropes articles".)
    Goals that should be well within reach:
  • Has some kind of system for describing:
    *Identity, defined as what/who the character identifies her/him/itself as.
    *Other things that are major, permanent influences on the character's decisions.
    *Metacognitive or self-control abilities, such as self-awareness, laziness, discipline, willpower, etc.

    And other important psychological characteristics.
  • Giving an impression of the character as good as you'd get from having a long talk with it, or reading a book that had the character as the protagonist.
  • Having enough precision that with a good CML one can determine what the character would do in most situations.
  • Being a useful tool for writing fiction, especially if doing so collaboratively.

    Hard, maybe impossible (read: insane and unrealistic) Goals:
  • Accurately modelling the character's internal states, thoughts, motivation, and navel-gazing philosophy.
  • A system for describing a character's preferences, habits, and personality well enough that one can reproduce the kind of artwork they make/would make.
  • Some kind of system for reverse-engineering a CML file from what actions the character took in various situations.
  • Describing a character in so much depth and detail that a CML document of sufficient size might describe more than is usually gained even through years of close relationship.
  • Describing yourself in the language is a useful tool for self-insight.
  • CML is a powerful alternative for specifying the personality of a sapient AI, and describing yourself in great detail in the language and inputting it into the AI is a viable method of brain uploading.
  • If one has a suitably detailed CML file, one could determine exactly how the character would behave in a specific situation, right down to what choice of words they would use to say something, as long as the decisions are not random. This is reliable enough that one can trust the method of making a description of oneself in the language and having that used to make important decisions if one is not able to.


    So what do you think? Feel free to discuss this system and what might be possible, but remember the main purpose of this thread is to discuss what is the best approach to make it become reality.
    "gheeh!" (c)h.azuma
    written by Yayo on Jul 18, 2009 01:20
    hmm.. interesting idea. : )
    sadly I don't have time right now, since it's very late and I'm going to sleep.
    It's a complex topic and needs to be discussed deeply.
    I'll try to reply something tomorrow.

    For now the only thing I can think of is that perhaps you could just use XML (eXtensible Markup Language) and just develop a set of tags and attributes for your purpose.
    : )

    written by Ajax on Jul 18, 2009 01:49
    Yayo said:
    For now the only thing I can think of is that perhaps you could just use XML (eXtensible Markup Language) and just develop a set of tags and attributes for your purpose.
    : )

    When I read the first bit of it, that's what I was thinking of, too. XML is very flexible and might be able to do what is needed for this project/idea.

    Sadly, I haven't read enough into it at this point to be of further help, but I like the idea so far. Let's see what some other members say...
    written by Armok on Jul 18, 2009 01:55
    On XML:
    That IS something I didn't think of, and maybe worth looking into, but my guess is we'll discover it's a bad idea. The language has a very specific purpose that isn't really helped by being compatible with other things, and it isn't intended to be parsed by a computer, which removes a lot of the need. It also has rather special requirements that I do not know if XML meets, and then there is the general law of things becoming more efficient and more elegant when specialized and independent.
    written by Ajax on Jul 18, 2009 11:29
    XML is used for many things. I used to work at a mortgage company that used XML for one of their web-based applications.

    I haven't used XML myself, but I do know that it might be helpful in what you're trying to achieve.
    r'lyeh sweet r'lyeh
    written by Neuzd on Jul 18, 2009 16:26
    I have also been thinking of XML since the beginning, and I was waiting for Armok to give us some more practical examples or better explain the purpose.

    XML is used for a very wide range of things and is a valuable solution for countless applications.
    The reason is that it is very flexible and has parser support in almost every programming language used today.

    This flexibility means that if your goal is very detailed, then efforts should be focused on the structure of the file and the way it is parsed.
    Parsers can be written in any language but they must be designed with the same set of rules that must be followed when writing a new Character File.
    The file could be of any complexity, and additional rules can also be embedded in it, but again, the parsers and the basic XML structure have to "match".

    I think that the easiest way to begin is writing down a kind of example (maybe some simple ones) of what you have in mind, and no, I don't mean in XML.
    What exactly will be there?
    Does one piece of information have some effect on some other value?

    Maybe you're trying to improve some other guy's idea and you can give us an example of that?

    I don't know, since for now it's all very foggy...
    written by Armok on Jul 18, 2009 18:02
    I'm starting to get rather annoyed. I read the Wikipedia page on XML and using it for this is NOT a good idea, for two main reasons.

    1) THIS IS NOT A COMPUTER LANGUAGE! Markup != computer. Interpreting the type of data stored would require an AI with a solid theory of mind, and thus there is little point in making the syntax computer-parseable either. This language should be equally useful if you write it on parchment in candlelight as if you use a computer.
    XML is a computer language; if you read the Wikipedia page, almost none of the advantages of XML are relevant, and almost all the disadvantages are.

    2) Dimensionality.
    XML is used in a variety of areas, but what they all really have in common is that the XML is just a frame around the natural language that makes up the real information the human is supposed to read. It describes how, when, where, and in what format the text or images should appear, but the actual text and images are not XML.
    With CML, on the other hand, you have a single abstract dimensionless object that you describe the properties of, and ALL of it should be in code, with no or as little as possible natural language in it. Also, the order in which you assign properties is irrelevant.
    If you want an example of what it might look like, the Dwarf Fortress RAW files are a much better example:
    	[PREFSTRING:terrifying features]
    As you can see, this already comes close to what I want to do, but it describes something different.
    First, it's still made for computer parsing, and thus could likely be made even more human-readable. Second, it describes the general and relative physical and psychological characteristics of a species, rather than the detailed and specific psychology and mind of a single individual. Third, and most importantly, it is made for a computer game and thus only has to support the finite, static, and known set of features implemented in the game, rather than the near-infinite, unknown, and ever-changing ones of a real mind.
    Still, on a syntax level it does the same thing: describes in detail a long series of properties of a single object.
    hello! :) felysian
    written by Hello! :) on Jul 18, 2009 19:25
    Technically XML is just a pre-made parse tree.

    Your example could be done in XML like this:
    <creature id="GOBLIN">
      <name singular="goblin" plural="goblins" adjective="goblin" />
      <appearance style="default" tile="g" color="7:0:0" />
      <appearance style="glow" tile="&quot;" color="4:0:1" />
      <evil />
      <intelligent />
      <likes-fighting />
      <property name="bonecarn" />
      <canopendoors />
      <prefstring>terrifying features</prefstring>
      <body type="humanoid">
        <part use="NOSE" />
        <part use="2LUNGS" />
        <!-- etc -->
      </body>
      <attack type="main">
        <!-- fill it in yourself -->
      </attack>
      <size value="6" />
      <fat value="2" />
      <equips />
      <nocturnal />
      <standard-flesh />
      <!-- more stuff -->
      <personality trait="ANGER" value1="25" value2="75" value3="100" />
      <!-- etc -->
    </creature>
    How you choose to define the structure is entirely up to you, you can use attributes or you can use tags (see personality). You can create other tags and be able to refer to them (e.g. body parts).
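To make the attribute-versus-tag point concrete, here is a minimal sketch of reading such a file back, assuming Python and its standard `xml.etree.ElementTree` module. The trimmed goblin document and the `summarize` helper are illustrative assumptions, not part of any agreed format.

```python
import xml.etree.ElementTree as ET

# A trimmed, well-formed version of the goblin example (illustrative only).
GOBLIN_XML = """
<creature id="GOBLIN">
  <name singular="goblin" plural="goblins" adjective="goblin" />
  <intelligent />
  <prefstring>terrifying features</prefstring>
  <body type="humanoid">
    <part use="NOSE" />
    <part use="2LUNGS" />
  </body>
  <personality trait="ANGER" value1="25" value2="75" value3="100" />
</creature>
"""


def summarize(xml_text):
    """Collect a few properties, showing attributes, flags, and nested tags."""
    root = ET.fromstring(xml_text)
    personality = root.find("personality")
    return {
        # Data stored as an attribute on the root element.
        "id": root.get("id"),
        # Data stored as an attribute on a child element.
        "plural": root.find("name").get("plural"),
        # A flag: the mere presence of an empty tag carries the meaning.
        "intelligent": root.find("intelligent") is not None,
        # Data stored as element text.
        "prefstring": root.find("prefstring").text,
        # Nested tags that other parts of a document could refer to.
        "body_parts": [part.get("use") for part in root.find("body").findall("part")],
        "anger": [int(personality.get(key)) for key in ("value1", "value2", "value3")],
    }


summary = summarize(GOBLIN_XML)
print(summary)
```

Whether a property lives in an attribute, in element text, or in a bare flag tag changes only how it is looked up, not what can be expressed.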
    written by Armok on Jul 19, 2009 02:39
    a) That was copied from the Dwarf Fortress RAWs, and is not an example of CML. b) That looks much harder to read to me, much less clear in structure, and much more annoying to write.
    i do my own stun-- avatars
    written by Albeyamakiir on Jul 19, 2009 04:07
    Ok, after a quick examination of the Wikipedia article on markup languages, I'm not entirely sure that what you're after is a markup language at all. You want a language to describe characteristics of a person and want it to be human readable, not computer readable. Wikipedia's page says:
    Wikipedia said:
    A markup language is a set of annotations to text that describe how something is to be structured, laid out, or formatted.
    The very definition implies that it would be perfect for computers (even though it was used well before them) and also that it doesn't stand alone. Therefore, I first believed what you wanted was a set of standards.

    However, you say:
    Armok said:
    with greater precision and less ambiguity than can (easily) be achieved with natural language, as well as being more compact and encouraging going into greater detail.
    Which sounds like characteristics of XML or most other computer markup languages. When criticising XML, you say
    Armok said:
    It describes how, when, where, and in what format the text or images should appear, but the actual text and images are not XML.
    But no ML can do images by itself, and I doubt someone's name can be written using only ML. The closest you could get is what Hello did, except with different syntax. In which case, you could take the best bits of XML, DF Raws and whatever else to make your own. If this still isn't right, perhaps you can describe it differently?
    written by Armok on Jul 19, 2009 05:02
    I think it will qualify as a markup language, at least technically, but it will most likely be very different from any existing markup languages. It will be a set of annotations that describe how a mind is structured.
    Yes, markup languages are perfect for computers, and 99% of all markup language usage is for computers, but the definition of a markup language doesn't actually say anything about that.
    The precision, lack of ambiguity, compactness, etc. seem more like characteristics of markup languages in general than of any specific one such as XML or CML.
    I did not intend to imply CML would contain images, but it wouldn't reference images either (in the direct sense that, say, an HTML document does; it can still have some tag that points to where to find a portrait of the character or something along those lines).

    To draw the best features from many different markup languages, or programming or natural or any other kinds of languages for that matter, was my intent from the beginning, and there are almost certainly some things to learn from XML and its derivatives. What I argued against was making the CML into such a derivative.
    written by Armok on Jul 19, 2009 10:15
    Ok, I'm reading about personality types on Wikipedia, and thinking about CML and whether I should have XML-like traits or not. Then inspiration hit me.

    I was not going to post any syntax, because I wanted people to come up with new creative solutions that were wildly different from each other rather than simply copying one suggestion just because it happened to come from the OP, but I have no idea where to post this otherwise.

    Please be harshly (but constructively) critical of the syntax I propose. Critique will either make me realize something I did wrong so I can fix it, or reveal something others have got wrong about what I'm after so I can explain it.

    This is preliminary; it is just a brainstorming idea and might not be used. I talk in absolute terms only because I find it easier:
    CML is a language for describing a series of traits. These traits are organized into a structure similar to a file hierarchy.
    For example, to say that a character likes bananas one could write:
    [Relation:Preference:Stimuli:"taste of/eating"="banana"=+0.9]
    Here, Relation means it's something that relates to a specific object, type of object, or concept that does not have special status in CML, as opposed to a universal and abstract Trait of the character. 0.9 refers to how much the character prefers the stimulus. Hopefully the rest is self-explanatory.
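Although CML is meant for human readers rather than parsers, a short Python sketch can make the structure of the banana tag explicit. The grammar below (a colon-separated path, then a quoted aspect, a quoted target, and a signed number) is only an assumption read off that single example, and the field names `path`, `aspect`, `target`, and `value` are hypothetical.

```python
import re

# Assumed shape: [Path:Path:Path:"aspect"="target"=+0.9]
TAG_RE = re.compile(
    r'^\[(?P<path>[A-Za-z]+(?::[A-Za-z]+)*)'  # e.g. Relation:Preference:Stimuli
    r':"(?P<aspect>[^"]*)"'                    # e.g. "taste of/eating"
    r'="(?P<target>[^"]*)"'                    # e.g. "banana"
    r'=(?P<value>[+-]?\d+(?:\.\d+)?)\]$'       # e.g. +0.9
)


def parse_telling_tag(tag):
    """Split one square-bracket tag into its hierarchy path and its values."""
    match = TAG_RE.match(tag.strip())
    if match is None:
        raise ValueError(f"not a recognised tag: {tag!r}")
    return {
        "path": match.group("path").split(":"),
        "aspect": match.group("aspect"),
        "target": match.group("target"),
        "value": float(match.group("value")),
    }


banana = parse_telling_tag('[Relation:Preference:Stimuli:"taste of/eating"="banana"=+0.9]')
print(banana)
```

The path list is what gives the "file hierarchy" feel: each further segment narrows down what kind of property is being assigned.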

    As you can see, this is much simpler, more efficient, and easier to understand than simply saying someone likes bananas. :p

    However, it can be long and bulky to write this for many different things, therefore:
    [Relation:Preference:Stimuli:"taste of/eating"="fruit"=+0.5]
    <Relation:Preference:Stimuli:"taste of/eating"=*>
    This translates to "the character likes fruit, but likes bananas more, as much as in the last example. He also likes apples slightly more than other fruit, but he doesn't like grapes more than other things despite their being a fruit, though he doesn't dislike them either."
    As you can see, preference is relative: the preference for every individual fruit is adjusted by the general liking of fruit, and has to be adjusted back if he is not to like it especially.
    However, what is more important to note here is how the XML-like syntax is used: the XML-style <tags> can't do anything on their own, but instead work as shortcuts for the [tags] that can actually assign values to properties.
    Let's call the XML-style <tags> "modifying tags" and the [tags] that can actually say something about the character "telling tags". There is no reason for this and the names are completely arbitrary; feel free to discard this and suggest something better, I just like to establish jargon.
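The relative arithmetic can be sketched in a few lines of Python. This assumes that an item's final preference is simply the category value plus a per-item adjustment; the adjustment numbers are hypothetical, chosen only so that banana ends at +0.9 and grape ends at neutral as the prose describes, and "pear" is an invented fruit with no adjustment.

```python
# General preference for the category, taken from the fruit example.
CATEGORY_PREFERENCE = {"fruit": 0.5}

# Per-item adjustments relative to the category (hypothetical values).
ITEM_ADJUSTMENT = {
    "banana": ("fruit", +0.4),  # 0.5 + 0.4 = 0.9, as in the banana example
    "grape": ("fruit", -0.5),   # adjusted back down to neutral
    "pear": ("fruit", 0.0),     # invented item that just inherits the category value
}


def effective_preference(item):
    """Combine the category preference with the item's own adjustment."""
    category, adjustment = ITEM_ADJUSTMENT[item]
    # Rounding keeps floating-point noise out of the result.
    return round(CATEGORY_PREFERENCE[category] + adjustment, 2)


for fruit in ("banana", "grape", "pear"):
    print(fruit, effective_preference(fruit))
```

Under this reading, a fruit the character feels neutral about needs an explicit negative adjustment, which is the "adjusted back" step mentioned above.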

    Modifying tags are not only shortcuts; they have one other function, logic and conditions. For example:
    <IF "time"==("morning" OR "monday")>
    This translates to "If it's early in the morning, or Monday, the character is Dominant, forceful, assertive, aggressive, competitive, stubborn and bossy, but otherwise she's Deferential, cooperative, avoids conflict, submissive, humble, obedient, easily led, docile and accommodating."
    Note that this means the CML doesn't correspond very well to psychological theory, or maybe it does and this is simply a nonhuman character.
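As a rough illustration of the logic only, the condition can be modelled in Python as a key plus a set of alternatives, with OR meaning "any one of them applies". The tuple representation, the `context` dictionary, and the two outcome labels are assumptions made for this sketch; nothing here is proposed syntax.

```python
# Hypothetical in-memory form of: <IF "time"==("morning" OR "monday")>
CONDITION = ("time", frozenset({"morning", "monday"}))


def condition_holds(context, condition=CONDITION):
    """True if any label currently applying to the key is one of the alternatives."""
    key, alternatives = condition
    return bool(alternatives & set(context.get(key, ())))


def describe(context):
    """Pick the branch of the conditional that applies in this situation."""
    return "dominant" if condition_holds(context) else "deferential"


# Several time labels can apply at once, e.g. a Monday afternoon.
print(describe({"time": {"monday", "afternoon"}}))
print(describe({"time": {"tuesday", "afternoon"}}))
```

Treating the situation as a set of labels means "monday" and "morning" do not have to be mutually exclusive values of a single field.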

    Presumably, modifying tags of some sort would be used for separating different characters in the same document, or separating the different parts of a split personality (like Gollum/Smeagol, the thing that is sometimes wrongly referred to as schizophrenia), but I don't want to delve into the syntax of such things just yet.
    Now there is one problem with this design, namely that the CML needs to be simple, intuitive, and user friendly. I have a very strange intuition and suck at user friendliness.

    Once again, I need suggestions and ideas; the things detailed in this post probably wouldn't work at all.
    written by Ginrai on Jul 19, 2009 17:20
    Wow, this is certainly some undertaking! You're a brave man, Armok!

    I don't have any particular feedback on the structure of the examples you've given so far. I would however like to stress that, no matter how well structured and flexible the markup may become, it is all for nought if the end product lacks meaning for the reader.

    <IF "time"==("morning" OR "monday")>
    This above example is neat, readable and succinct. However, the syntax operates on very broad terms and is very much open to the reader's interpretation. When is "morning"? 12:00am to 12:00pm? If the character being described is woken at 7am or 3am, is their dominance level always still ~90%?

    For that matter, what is "90%"? 90% of my idea of what "dominance" is? 90% of your idea?

    What is "dominance" anyway? You describe a dominant character as "forceful, assertive, aggressive, competitive, stubborn and bossy", but that isn't necessarily the same perception that I or others would have.

    For the above CML to hold absolute meaning for the reader, a frame of reference would have to be established elsewhere in the document. Perhaps a CML document would comprise two sections; one establishing a frame of reference, the other indicating how the character compares to that frame of reference...?
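The two-section idea could look something like the following Python sketch. Everything in the frame of reference is a hypothetical choice made for illustration (the hours that count as "morning", and a 0 to 10 scale for dominance); the point is only that the character section stores relative values and the frame says what they are relative to.

```python
# Section 1: the frame of reference (all values are assumptions for this sketch).
FRAME = {
    "morning": range(5, 12),                  # hours 05:00 to 11:59 count as morning
    "dominance": {"min": 0.0, "max": 10.0},   # the scale that percentages refer to
}

# Section 2: the character, expressed relative to that frame.
CHARACTER = {"dominance_fraction": 0.9}       # "90% of maximum"


def is_morning(hour, frame=FRAME):
    """Decide whether a clock hour (0-23) falls inside the frame's morning."""
    return hour in frame["morning"]


def dominance_on_scale(character=CHARACTER, frame=FRAME):
    """Convert the relative value into the units defined by the frame."""
    scale = frame["dominance"]
    return scale["min"] + character["dominance_fraction"] * (scale["max"] - scale["min"])


print(is_morning(7), is_morning(3))
print(dominance_on_scale())
```

With the frame written down, two readers who disagree about when morning starts can at least see which definition the author used.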

    And yes, I am aware that the quoted example was simply intended as an overly-simplified suggestion of structure, but I'm just trying to provide a little food for thought : )
    written by Armok on Jul 19, 2009 18:43
    Thanks! This is just the kind of feedback I need.

    You are correct that "morning" is not defined, and when you think about it, Mondays and mornings have something in common; a more proper method would have been referring to the level of tiredness rather than specific times when a person is usually tired.

    Dominance, on the other hand, is very well defined: you might have noticed the "16PF"; it refers to http://en.wikipedia.org/wiki/16_Personality_Factors
    The ~90% was just arbitrarily chosen because I couldn't find the unit it's normally measured in, and it seemed there was a maximum and a minimum, so it refers to 90% of maximum.

    (My Firefox is acting up so no spellcheck this post; consider this an insight into how hopelessly reliant I am on technology.)
    written by Anderson on Jul 20, 2009 19:20
    A better way to do the food would be to give a general mix of flavors that the character likes:

    {taste prefs}
    (sweet -3)
    (sour +5)
    (bitter -2)
    (salty +2)
    (umami +9)

    and then assign taste values to the foods, which are modified by cooking and mixing (adding salt would modify salty, obviously)

    something like
    [taste values]
    (sweet .5)
    (sour .1)
    (bitter .6)
    (salty .3)
    (umami .0)

    This would cause characters to come to a more reasoned list of food preferences, and might lead to someone who "likes kittens for their deliciousness."
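One way to combine the two lists, sketched in Python, is a weighted sum: multiply each of the character's flavor preferences by how strongly the food has that flavor, then add the results. The weighted sum itself is an assumption (the post does not say how the numbers combine), and the 0.4 amount of added salt is a made-up figure.

```python
# The character's flavor preferences and one food's flavor profile,
# using the numbers from the two lists above.
TASTE_PREFS = {"sweet": -3, "sour": 5, "bitter": -2, "salty": 2, "umami": 9}
FOOD_TASTES = {"sweet": 0.5, "sour": 0.1, "bitter": 0.6, "salty": 0.3, "umami": 0.0}


def liking(prefs, tastes):
    """Weighted sum of preferences over the food's flavor intensities."""
    return round(sum(prefs[flavor] * tastes.get(flavor, 0.0) for flavor in prefs), 2)


def add_salt(tastes, amount):
    """Return a copy of the flavor profile with the salty value raised."""
    salted = dict(tastes)
    salted["salty"] = salted["salty"] + amount
    return salted


plain = liking(TASTE_PREFS, FOOD_TASTES)
salted = liking(TASTE_PREFS, add_salt(FOOD_TASTES, 0.4))
print(plain, salted)
```

For this character the plain food scores -1.6 and the salted version -0.8, so cooking changes the verdict without the document having to list every dish separately.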

    Anyways, I'm still having trouble conceiving a similar system for personality. Most personality graphs are really a group of graphs; if one could distill that, you could get the value of a personality to appear as a pair of Cartesian coordinates.