MySQL & Latin1 woes … E4X … Pirahã … Alcabala 22nd of June, 2007 POST·MERIDIEM 10:53
Okay, for that eager plethora (heh) of my readers who have heavily used non-ASCII UTF-8, who stored that UTF-8 with metadata saying that it was latin1 in a MySQL 4.1.11 database, and who need to move to a MySQL 5.0.41 where such thoughtless trust of the program not to corrupt your data no longer works, I have enlightenment on how to do that migration.
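For anyone unsure what kind of corruption is at stake, here is a minimal Python sketch of it. The sample string is my own choice, and Python's latin-1 codec only stands in for MySQL's latin1, which is not byte-for-byte the same character set; treat this as an illustration of the principle, not of what either MySQL version does internally.

```python
# Hypothetical sample text; any string with non-ASCII characters would do.
original = "Pirahã"

# What the application actually handed to the database: UTF-8 bytes.
stored_bytes = original.encode("utf-8")

# A program that believes the latin1 label and converts to UTF-8 reads
# each byte as one latin1 character and encodes that character again.
misread = stored_bytes.decode("latin-1")
double_encoded = misread.encode("utf-8")

# Reading and writing with the same (wrong) label changes nothing, so the
# bytes survive and can afterwards be interpreted as the UTF-8 they are.
round_tripped = stored_bytes.decode("latin-1").encode("latin-1")
recovered = round_tripped.decode("utf-8")

print(repr(misread))         # two characters where there was one
print(repr(double_encoded))  # four bytes for the final letter, not two
print(repr(recovered))       # the original text, intact
```

The round trip in the second half is the reason for dumping with latin1 as the character set: if both sides agree on the label, wrong as it is, nobody converts anything.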
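First, dump the old database, telling mysqldump that the connection is latin1 so that it hands the bytes over without converting them:
    mysqldump -u user-name --password=password --default-character-set=latin1 database-name
Save that output to a file, then replace latin1 with utf8 everywhere in it that’s not in your data. Finally, load the modified dump into the new server:
    mysql -u user -p database-name-not-password,-sorry-to-confuse-you < modified-dump-file-name
and give the password for the database server (which you have configured, right? Right?).
ECMAScript for XML is the unwieldy name of a recent standard for processing XML data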
with JavaScript, and after a couple of days working with it, I’m totally
impressed. For me, the initial stumbling blocks were the namespaces and
xmlns; but a var jdfns = new Namespace("http://www.CIP4.org/JDFSchema_1");
and a subsequent addressing of all elements (attributes, etc.) with
jdfns::element-or-attribute-name resolved that quickly enough, yay. But
after that, for example, I commented one evening that were
I particularly perverse, I could parse a configuration file to find an
integer ordering of some set of attributes; the next day, I found this
translated to ten lines of code. Thoroughly recommended if you use XML and
JavaScript on a regular basis, though perhaps irrelevant if your code needs
to function on Internet Explorer.
And thirdly, in energetic contrast to Emma and Simon’s take on it, I found this New Yorker article on Daniel Everett’s work with the Pirahã really, really good. It’s a magazine article, and as such it doesn’t try to treat the linguistics in detail, any more than the recent article on Григорий Перельман treated the mathematics of the Poincaré conjecture, but it deals well with communicating the sociology of the disagreements the Pirahã provoke; it quotes Michael Tomasello in a critical but diplomatic tone, and gives a vivid picture of the occasional hellishness of tropical field work.
The impression I get from it (and it is to my discredit that I hadn’t read
the relevant papers already, but in my defense they were on lingbuzz,
which to anyone not interested in generativism is as interesting as the
theological debates of the Seventh-day Adventists) is of the Pirahã as the
apogee of anti-intellectualism; when other language communities have had
number systems that lacked the fine differentiation of most Western
languages, they were happy to pick up more finely differentiated ones, but
for the Pirahã the difference seems to have been a social pressure not to.
Word of the day: قبالة qabālat (v.n. of قبل), in Persian qabāla, qubāla: surety, contract (especially of bargain and sale); in Spanish as la alcabala, an historical sales tax; in Hebrew as kabala קבלה, meaning invoice/receipt.
FotC … Luckily, I’m not attending a college to drop out of … Çin 1st of June, 2007 POST·MERIDIEM 09:09
Sunday I spent mostly listening to Flight of the Conchords, a fine, fine New Zealand band who have, for example, this piece on Youtube: Business Time. Besides that, not doing anything constructive.
Saturday, I came across a German mattress shop with a large sticker ‚Preiѕhіt‘ on its front window.
Since then, meh.
I’ve been pasting interesting links (interesting to linguistics nerds, that
is) to #linguistics on irc.freenode.net on the invitation of Francis Tyers
for the last few weeks now, and an agreeable corner of IRC it is too.
Haven’t been conscientiously logging, so can’t really refer to any of them
here that are not already posted to http://del.icio.us/aidan/
Word of the day: Чин is Tajik and Turkic for ‘China’; not unrelatedly, this was the term Marco Polo used for the country, the word ‘China’ itself being introduced to Europe by other authors.
Last comment from Ibrahim on the 16th of July at 13:33
Tajik people also use Хитой for "China"