Tuesday, 2008-06-03

*** skxpl has quit IRC00:18
*** pmiller has quit IRC01:05
*** presroi has joined #cc01:23
*** vostorga has joined #cc01:35
*** vostorga has left #cc01:35
*** rejon1 has joined #cc02:13
*** rejon has quit IRC02:28
*** presroi has quit IRC02:31
*** hdworak has joined #cc02:43
*** rejon1 has quit IRC03:09
*** pmiller has joined #cc03:22
hdworakstumbled upon this bug: http://groups.google.com/group/beautifulsoup/browse_thread/thread/c7d096e68ff4521c03:27
hdworakso watch out03:27
*** pmiller has quit IRC03:27
*** rejon has joined #cc03:40
*** skxpl has joined #cc03:51
*** rejon1 has joined #cc04:01
*** rejon has quit IRC04:01
*** gdsf has joined #cc04:29
*** gdsf is now known as bring204:30
bring2hello, im curious if i would like to 1) allow anyone to use & remix the work with attribution (http://creativecommons.org/licenses/by/3.0/) and also 2) allow non-commercial usage with no attribution, is there a good way to do this through CC licenses?04:31
*** rejon1 has quit IRC04:41
*** BobChao has left #cc04:52
hdworakbring2: I guess you can simply go for a dual license04:59
*** ankitg has joined #cc04:59
hdworakhi Ankit04:59
ankitgHi hdworak05:02
bring2hdworak, thanks but what other license? all of them include "attribution" except public domain :/05:05
hdworak"All of our licenses require that you give attribution in the manner specified by the author or licensor."05:07
hdworakyou are right05:07
* hdworak has found a bug in utidylib and is not happy about that at all05:09
bring2hmm, can you recommend some other licenses then? id like to include a standardized license, but would also prefer to allow non-commercial works to skip attribution (if they so choose)05:10
bring2what's utidy?05:10
hdworakor Ubuntu package python-utidylib05:11
hdworakok, issue resolved05:34
hdworakbring2: I can't think of anything, I'm not a lawyer, I'm sorry05:35
*** bheekling has joined #cc05:35
hdworakwhat is it that you actually want to license?05:36
hdworakmusic, software, or whatnot?05:36
bring2blog posts, some technical guides05:45
*** bheekling has quit IRC05:47
hdworakthen I dunno really05:49
hdworaksay you would like to have cc-nc w/o sa (if one existed)05:50
hdworakand w/o by05:50
hdworakdo you mean w/o by and w/o sa?05:50
hdworakif so, what's the problem with copying your work to a non-commercial work which is then released into public domain?05:51
hdworakso that another project can use it commercially05:51
hdworakor do you mean nc-sa ?05:51
hdworakbut w/o by?05:51
*** hdworak has quit IRC06:23
*** hdworak has joined #cc06:54
bring2well, public domain means that anyone could use it commercially, without attribution06:58
bring2but, i'd prefer commercial use requires attribution, while letting non-commercial usage forego attribution (if they choose)06:58
bring2<hdworak> say you would like to have cc-nc w/o sa (if one existed) - that sounds about right :)07:01
*** BobChao has joined #cc07:04
ankitgbring2 you need a CC BY license for those who want use your work for commercial purposes and a PD for the non-commercial purposes ...07:07
bring2ankitg, hmm but how can you release something to only non-commercial PD? if it's PD, that means someone could use it for commercial purposes right?07:08
hdworakthere is no non-commercial PD07:08
hdworakso as I was saying, if you have a non-commercial license but not share-alike07:09
hdworakI take 100% of your work to a non-commercial project named Project107:09
hdworakand license this project as PD07:09
hdworakthen I take Project1 and do Project2 based on that, which is commercial07:10
hdworakonce you go public domain with your project, you can't sue me for anything, because you've withdrawn all of your rights to your work07:13
bring2sorry im not so familiar with these terms, but in this case Project1 could be used for commercial purposes, without attribution, which is not something i want to allow07:13
hdworakimho Project1 not, just, if it's released under PD, its remixes07:14
hdworakbut that's just how I understand the lack of share-alike, I can be plain wrong here07:15
hdworakplus I don't know many alternatives to cc licences (when we do not count software licences)07:15
hdworakyou can secure all the rights (copyright), you can withdraw all the rights (pd)07:16
bring2in the example, Project1 is released as PD, but i want to require any commercial users to include attribution07:16
hdworakone alternative is GNU FDL07:16
bring2yeah that is where i know CC licenses from, i think any software license would be ok as long as it includes the terms im interested in, just haven't found one07:16
hdworakonce Project1 is released as public domain, you cannot require anything about it07:16
bring2yeah that is why your example doesnt work for me :(07:17
hdworakit's like you never ever owned it in the first place07:17
bring2ok cool ill check GNU FDL, you think that will be appropriate for what i'd like?07:17
hdworakno, absolutely not07:18
hdworakI just mention it as an alternative07:18
ankitgI see the problem ... you need a CC-BY license for your data but want to give the option to forgo the attribution requirement for non-commercial uses ...07:18
hdworakin general07:18
bring2hdworak,  hehe ok well ill still check it out07:18
hdworakthe whole problem here is whether you make the license viral or not07:18
bring2ankitg, yep that is it exactly07:18
ankitgI am not sure how that clause can be made possible ... al the CC licenses offered now require attribution ...07:19
bring2ankitg, yeah it looks that way, its understandable but that means i think ill have to find some other licenses besides CC :/07:19
hdworaklet's make it clear07:20
hdworakyou have a work you want to license (Project0)07:20
hdworakanother person wants to make a public domain work based on Project0, the work of this person is not commercial07:21
hdworaklet's call it Project107:21
hdworakwhat do you want Project1 to do?07:21
bring2hmm well there is not anyone else involved right now, but i want people to have the option of using Project0 (of which i am the author) in either a Non-Commercial purpose, in which case they may choose not to attribute my original Project0, or for Commercial purposes, in which case they must include attribution for my original Project0 work07:23
* bring2 *type type type*07:23
hdworakI understand, I'm asking about Project107:26
hdworakwhat do you want the author of Project1 to do?07:26
bring2actually none of it should be released into the public domain, just want to have pretty open licenses07:26
hdworakin the case I've just described07:26
hdworakI understand that you do not release it under PD07:26
hdworakI'm saying non-commercial Project1 goes PD07:26
bring2idk what would Project1 be?07:26
bring2i don't know07:28
hdworaklet's say a Web page entitled07:28
bring2ok, so you mean what portions could Project1 use if they wanted to release as PD?07:28
hdworakno, I'm saying the problem is you do not want share-alike07:29
hdworakyou just want non-commercial07:29
hdworakso someone does non-commercial work and does not release it under the same license, but under a less restrictive license or even w/o one (PD)07:30
hdworakand they do not attribute you, 'cause you didn't ask for that07:30
hdworakis that right?07:30
hdworakthen how on Earth can a third guy doing commercial work based on that non-commercial work know you've created Project0 - the origin of it all - in the first place?07:31
hdworakif you didn't require attribution in Project107:31
bring2hmm ok well, Project1 could not release my work under PD, because im still holding the copyright07:35
bring2they have the right to use the work from Project0 (without attribution) only for non-commercial purposes07:37
hdworakso let's say they say07:37
hdworakyou can use Project1 for whatever you like unless it's commercial07:38
hdworakis that valid w/ the license of your project?07:38
bring2hmm i guess that is the Share-Alike clause, i do not want to require it, but of course they cannot grant any privileges which have already been disallowed07:39
hdworakwhen I've asked you in the very beginning, you said you do not want share-alike07:40
bring2so, Project1 can use any license they want, but they could not release in the Public Domain because they do not hold those rights07:40
hdworakif it's nc-sa, then it's no problem07:40
bring2they can use the same license or not, it doesn't matter to me :)07:41
hdworakif it's just nc, then I do Project1 based on that with the following license: "you can use this for any purpose and under any license, as long as this is non-commercial; in such case you do not need to include a license yourself and attribute any of us"07:42
bring2nc-sa would be fine, but in that case i don't want to require them to attribute me07:42
bring2yah simple non-commercial would be fine too07:42
hdworakthen comes Project2 which is also non-commerical, takes all the code from Project1 and releases it into PD07:42
bring2maybe there is no need, but i'd prefer some license that is written up somewhere i can refer to :)07:43
hdworakthen comes Project3 which takes PD code and uses it commercially07:43
bring2umm Project2 does not have the right to release it under PD07:43
hdworakthey could do anything with the code if it's for commercial purposes07:44
hdworakwe're talking about non-by non-sa case07:44
hdworakjust pure nc07:45
bring2what's by?07:45
hdworakif it's for = if it's not for07:45
bring2ok yeah, they can freely use it, but that does not mean they have the right to grant PD license to the material07:45
hdworakwhy is that?07:46
hdworakare you sure you're talking about Project2 not Project1?07:46
bring2im still holding the copyright, just allowing its use under certain terms, at what point was the work released into the public domain?07:46
hdworakProject 0 (your license) -> non-commercial Project 1 (nc non-by non-sa) -> non-commercial Project 2 (pd) -> commercial Project 307:47
hdworakProject 2 guys don't even know you exist07:47
hdworakbecause you didn't require -by from Project 107:48
ankitgI believe there is way to have two different licenses for commercial and non-commercial purposes ... and I understand you want a CC-BY license for the commercial uses of your work ... I can't think of anything other than PD which is less restrictive than CC-BY ... though I am not a lawyer ...07:48
bring2neither Project1 nor Project2 have the authority to allow usage which is not permitted according the terms of the Project0 release license07:48
hdworakfor Project2 there is no Project0, right?07:49
hdworakbecause there is no-by and no-sa07:49
hdworakif you want remixes to inherit your rights, you use -sa07:49
hdworakif you don't want them to inherit your rights, you go no-sa07:50
hdworakyou did nc, no-by, no-sa07:50
hdworakhow is P2 supposed to know about P0?07:51
hdworakplease explain to me :)07:51
bring2well even if Project1 does not use the exact same license, they cannot grant certain privileges, as copyright ownership is still maintained by Project007:51
bring2idk that is their problem :)07:51
hdworakP1 is in full compliance with your terms07:51
hdworakthey released nc work07:51
hdworaknow they license their work as pd for non-commercial use only07:52
bring2ankitg, thanks i guess thats what im looking for07:52
*** tvol has joined #CC07:52
hdworakby pd here I mean "anything goes as long as your work is non-commercial"07:52
bring2P1 can use the content, but regardless of -SA they are not allowed to release the work as PD07:52
hdworakby pd here I mean "anything goes as long as your work is non-commercial"07:53
hdworakcan P1 be released on such terms?07:53
hdworakbecause your license was only nc07:53
bring2well im not sure what "anything goes" would include, but my point is that P0 does not release the work into PD, therefore P1 cannot either07:53
bring2the license allows non-commercial usage, but does not mean that anyone can claim authorship or release the work into PD when it has not been07:54
hdworakanything goes = "you are hereby granted the right to use this work for any purpose and in any medium as long as your work remains non-commercial"07:54
hdworaklegal for P1?07:54
bring2yeah if P1 were to try to place material from P0 into the public domain, that would be illegal, since P0 still owns copyright07:55
bring2im not a lawyer either btw :D07:55
hdworakI'm sorry I've used the term PD when it comes to P107:55
hdworakplease replace with "you are hereby granted the right to use this work for any purpose and in any medium as long as your work remains non-commercial"07:55
hdworaklegal for P1?07:55
hdworakthis would be the least restrictive nc license I could think of07:56
bring2yeah so this would the license for the P0 material?07:56
hdworakyes, but I'm introducing P1 to show you how we get rid of your license in the midstep07:57
hdworakbecause P0 was dual-licensed, P1 is not07:57
bring2or for the P1 material? i guess the point is, P1 cannot "release" P0 material under any license, since they do not own the copyright, they can merely use it07:57
*** kristallpirat has joined #cc07:57
hdworakI thought you want nc to be transferable07:58
hdworakbut now it's clear you do not allow any P2s if P1 uses your work w/o attribution non-commerically07:59
hdworakI thought you thought of P0 nc license as "you are hereby granted the right to use this work for any purpose and in any medium as long as your work remains non-commercial"07:59
hdworakand you seemed to confirm this by "yeah so this would the license for the P0 material?"08:00
hdworakbut you want nc-usage of your work to be w/o the right to further remixes08:00
bring2yes, they are free to use it, but cannot grant license privileges which have still been retained in the P0 license08:00
hdworakthen this is share-alike08:01
hdworakwhich I'm trying to explain from the very beginning08:01
hdworak1st question08:01
bring2well, as i understand it, a licensee is prohibited from granting additional right by the nature of copyright law, regardless of a SA clause08:01
ankitgHugo, I think bring2 just needs to use a CC-BY ... he wants to be extra nice by saying attribution is optional in cases of non-commercial usage ... which I think can just be mentioned on the blog where he links to the license deed ...08:02
hdworakif someone does non-commercial work (P1) under your license (nc w/o by), can this work be a subject of further work by someone else?08:02
bring2but i really don't mind including a SA clause, as long as its still possible to allow what i want08:02
hdworakyes, all I'm saying is that if there is no -by08:02
hdworakwhich is a subject of all cc licenses08:02
hdworakthen P2 will never know about you08:03
hdworakabout your dual license for P008:03
bring2ankitg, that might work fine, just trying to find a bit more standardized license than stating this in my own language08:03
hdworakall they know is "this was meant to be used for non-commercial purposes"08:03
ankitghdworak, true ... which is why BY is a feature of all cc licences ..08:03
hdworakankitg: exactly08:03
hdworakand all I'm stating is that it's there for a good reason08:04
hdworakif you release under nc w/o by and w/o sa, there is no you08:04
ankitgalso because CC tried launching some licenses without the BY clause and there was very little adoption ... so they dropped them ...08:05
bring2well P1 could use the content without attribution and without releasing under the same license, but still would not have the right to change the license for any P0 material08:05
hdworakbring2's P0 - dual license (by OR nc)08:05
hdworaknon-profit's P1 - nc08:06
hdworaknon-profit's P2 - pd08:06
hdworakfor-profit's P3 - all rights reserved08:06
hdworakbring2: when P1 allows the reuse of their work, they do not need to include your license or attribute you08:07
hdworakbecause you've chosen nc only (w/o sa, w/o by)08:07
hdworakand they comply - their work is nc08:07
bring2p2 cannot release NC work as PD08:07
hdworakwhy's that?08:08
hdworakwhere is that stated?08:08
bring2they do not own the rights on P1, they are only allowed to use it under terms of P208:08
bring2my understanding is you do not have the right to release work as PD unless you are the copyright owner08:08
hdworakthey've complied with the P1 license - "anything goes as long as it's nc"08:08
hdworakso they're releasing their nc work as pd08:09
hdworakwhat's illegal about this?08:09
bring2well the P0 license grants usage, but P1 does not have the rights for P0 material, so if the P1 release includes P0 material they cannot grant any rights which are reserved by P008:10
hdworakbut it ain't share-alike :)08:11
bring2likewise P2 can use material from P1 under P1's license, but that does not include the right to release P1 material into the public domain08:11
hdworakthis is our assumption from the very beginning08:11
bring2those rights are reserved inherently by copyright law, even if P0 does not explicitly require share-alike08:11
hdworakif it was nc-sa, then you've got a perfectly valid point08:12
bring2according to my understanding of US copyright law08:12
hdworakok, then cc sa doesn't make sense08:12
hdworakoh no08:12
hdworakI get it now08:12
bring2if i write a book, and release it for non-commercial usage, someone cannot put it into a story collection and release the whole thing as public domain, even if i didn't originally require share-alike08:13
hdworakn/p, this conversation is over, I've understood the purpose of sa08:13
hdworakthanks for helping me understand it08:13
bring2frankly, i'd be okay with the share-alike requirement, but still can't figure out how to allow commercial use with attribution, and non-commercial use without attribution (using CC licenses)08:14
bring2lol sure np, working out all the different possibilities helps me figure out what i want alot better too :)08:14
hdworakit's simple - using just the pool of cc licenses, you can't08:17
hdworakthey all require by08:17
hdworakyou would have to explicitly disclaim that right08:17
bring2yup thats what im thinking, do you know if there are any good websites that compare popular content sharing licenses? something similar to CC licenses, with that small change would be good for my use08:18
hdworakfrankly if you do not count software licenses, I can think only of pd, cc, and gfdl08:19
bring2that might be the best option, trying to write up my own license is too cumbersome, and people are hesitant to use it unless they recognize the license08:19
hdworakif you're not a lawyer do not even think of writing your own license08:19
bring2some software license might work ok, there shouldn't be much difference between a computer program source file and a blog post08:19
bring2haha yeah, thats what im trying to avoid :)08:20
*** ankitg has quit IRC08:22
bring2cool thanks, ill check these out08:22
bring2they take alot longer to read than CC's nice summaries :D08:23
*** tvol has quit IRC08:26
*** tvol has joined #CC08:29
conleybring2: There's also art libre08:44
*** sama has joined #cc08:44
hdworakrecommended by FSF :)08:48
*** BobChao has quit IRC08:54
*** tvol has quit IRC08:56
*** sambhav has joined #cc08:59
hdworakhttp://ben.adida.net/presentations/www2008-rdfa/#(26) says "license is a reserved HTML keyword" - since when?09:00
hdworakcan't see it listed here http://www.w3.org/TR/REC-html40/types.html#type-links09:00
*** sambhav is now known as GenX09:00
*** GenX is now known as Sambhav09:01
*** tvol has joined #CC09:04
*** greg-g has joined #cc09:09
*** tvol has quit IRC09:18
*** tvol has joined #CC09:19
*** jordon has joined #cc09:23
*** Sambhav has quit IRC09:28
*** nathany has joined #cc09:33
*** encompass has joined #cc09:34
encompasshello everyone...09:34
encompassDoes CC have a standard on how to tag your rss feeds or video to show it's CC licence?09:35
encompassI am working on making a reader for this for Miro09:35
*** sama has quit IRC09:40
*** sama has joined #cc09:41
bring2idk, but Miro is great :)09:42
*** jordon has left #cc09:48
*** jgay has joined #cc09:49
encompassbring2: thanks, you're cool too :D09:50
bring2lol, not as cool as Miro :D09:51
bring2but thanks :)09:51
nathanyencompass: see http://wiki.creativecommons.org/Syndication09:53
nathanythere are specs there for RSS, RSS 2 and Atom09:53
* encompass hugs nathany "dude thanks"09:53
*** paulproteus has joined #cc09:53
nathanywe did a patch for Miro a while back that implemented at least part of this (exposing license information from the feeds)09:53
nathanyit's in their builds now09:53
* paulproteus waves quietly.09:53
nathany(and has been since 1.1 or something like that)09:53
nathanyhola, paulproteus09:53
hdworak http://ben.adida.net/presentations/www2008-rdfa/#(26) says "license is a reserved HTML keyword" - since when?09:54
hdworakcan't see it listed here http://www.w3.org/TR/REC-html40/types.html#type-links09:54
nathanyhdworak: perhaps he was referring to XHTML?  you'd have to ask him :)09:55
hdworakis there a difference in link types for XHTML?09:55
hdworakI thought it's just a reformulation of HTML as an XML application09:55
hdworakthere is no word license in XHTML 1.0 recommendation09:56
hdworakand apparently no such word in HTML 4.01 either09:57
hdworakok, so I don't know what Dr. Adida had on his mind09:58
nathanyhdworak: http://www.w3.org/1999/xhtml/vocab/10:00
hdworakoh, I see IvanHerman is on IRC10:00
hdworakok, but that's not pure XHTML10:01
hdworakbut that XHTML+RDF chimera?10:01
nathanywhat are you talking about?10:01
hdworakXHTML 1.0/1.1 recommendation10:01
paulproteusnathany, He's talking about the XHTML+RDFa DTD10:01
paulproteusBut no, this is not that; this is the XHTML vocabulary.10:01
paulproteusI think10:02
hdworakit's developed by XHTML 2 WG10:02
* nathany returns to thinking about things that he cares about atm10:02
*** stevel has joined #cc10:03
hdworakthanks for the link :)10:04
hdworakis there a list of all these means (present and deprecated) of embedding info about cc licences? at least in HTML?10:05
*** ajbrooks has left #cc10:06
hdworakRDFa, DC's dc:license, RDF comments embedded in the XHTML code, RDF comments embedded directly in "head" or "body" elements, external RDF files10:08
nathanyhdworak: see http://wiki.creativecommons.org/Extend_Metadata10:09
hdworakif we have rel="license" in the XHTML code, does it always refer to the license of this particular document or it can refer to a license of something else that is mentioned on the page?10:14
nathanyhdworak: it depends ;)10:20
nathanyseriously, though, when processing as RDFa, you need to determine the context10:20
nathanyi don't recommend dealing with this yourself, just use an existing rdfa parsing library like librdfa or rdfadict10:20
nathany(note the latter is somewhat out of date)10:21
hdworakactually it's 0 even w/o lang:py10:25
hdworakgonna look for a tutorial/example on Google10:26
paulproteushttp://rdfa.digitalbazaar.com/librdfa/trac/browser indicates there are Python bindings.10:26
hdworakyes, and they provide .deb packages10:27
hdworakI've already installed that10:27
hdworakthey call it rdfa as package name10:28
*** Sambhav has joined #cc10:29
*** presroi has joined #cc10:30
*** jordon has joined #cc10:30
hdworakso when parsing RDF, all we'll get are those triples?10:35
hdworakand then if predicate is license or DC:license and object is a URL somewhere in the cc domain then we care?10:44
hdworakon that Wiki page http://wiki.creativecommons.org/Extend_Metadata#Defining_dc:rights10:48
hdworakshouldn't it be dc:license instead of cc:license  (third code block)10:48
*** tvol_ has joined #CC10:59
*** tvol has quit IRC11:00
*** sama has quit IRC11:01
*** sama_ has joined #cc11:01
hdworakis the only thing we are interested in the hyperlink to the license?11:13
hdworakthere are no more metadata about licensing itself in the document11:13
hdworakjust the link to cc ?11:13
*** rejon has joined #cc11:15
*** Sambhav has left #cc11:18
hdworakand we cannot rely on the "cc" or "dc" prefix, but we have to check to which namespace URI it does correspond, right?11:22
hdworak'cause someone can name xmlns:foobar="http://validuri" and then foobar:license11:22
paulproteushdworak, Right, that's part of the point of XML (and RDFa) namespaces.11:23
paulproteuslibrdfa will handle it, though.11:23
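
The point about prefixes can be sketched in a few lines of Python. This is only an illustration, not how librdfa does it internally: `expand_qname` and `is_cc_license_predicate` are hypothetical helpers, and the prefix map stands in for whatever `xmlns:` declarations a real parser would collect from the page.

```python
# Hypothetical helpers; the namespace URI is the Creative Commons one.
CC_NS = "http://creativecommons.org/ns#"


def expand_qname(qname, prefix_map):
    """Turn 'prefix:local' into a full URI using the declared prefixes."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefix_map:
        return None  # no prefix, or prefix never declared
    return prefix_map[prefix] + local


def is_cc_license_predicate(qname, prefix_map):
    """Compare the expanded URI, never the prefix text itself."""
    return expand_qname(qname, prefix_map) == CC_NS + "license"


# 'foobar' is bound to the CC namespace, while 'cc' is bound to something else.
prefixes = {"foobar": CC_NS, "cc": "http://example.org/not-cc#"}
print(is_cc_license_predicate("foobar:license", prefixes))
print(is_cc_license_predicate("cc:license", prefixes))
```

Comparing expanded URIs is what makes `foobar:license` count and a misleading `cc:` prefix not count.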
hdworakoh, finally11:24
paulproteusI went to take a nap.11:24
hdworakI thought I have exceeded the allowed questions per week limit11:24
*** sama_ has quit IRC11:24
hdworakbut anyway, the above are serious questions11:25
hdworakwe parse RDF (external or embedded) and we get triples in the result11:25
hdworakand from those triples, we are interested only in those that contain http://cc.org/something as the object11:26
hdworakis that true?11:26
paulproteusNo, I think there may be a few others.11:26
paulproteusWe should display dc:title for example if we find it.11:26
paulproteusBasically, whatever the license chooser generates.11:26
*** ajbrooks has joined #cc11:27
hdworak1. we scan for objects containing creativecommons URI11:28
hdworak2. if we find such a triple, we check the predicate - whether it matches license or dc.license or dc.rights11:28
*** jordon has left #cc11:28
paulproteusdc.license might be rdf:sameAs cc:license, I'm not sure.11:28
hdworak3. if it does, we know that the subject has a CC license11:28
paulproteusdc:license, that is.11:28
hdworak4. then we can present other data (found in other triples) that belong to the subject with a human-readable info about the license11:29
hdworakwe can present such data (=other triples with the same subject) in a table11:30
hdworakor do you mean something else?11:30
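
The four steps above can be sketched roughly as follows, assuming the triples have already been extracted as `(subject, predicate, object)` tuples with full URIs. The function name `licensed_subjects` and the exact set of accepted predicates are assumptions for illustration; whether `dc:license` should be treated like `cc:license` is still an open question in the discussion.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Assumed set of predicates that may carry a license link.
LICENSE_PREDICATES = {
    "http://creativecommons.org/ns#license",
    "http://www.w3.org/1999/xhtml/vocab#license",
    "http://purl.org/dc/terms/license",
    "http://purl.org/dc/elements/1.1/rights",
}
CC_HOSTS = {"creativecommons.org", "www.creativecommons.org"}


def licensed_subjects(triples):
    """Map each CC-licensed subject to its license and its other triples."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    report = {}
    for s, p, o in triples:
        host = urlparse(o).hostname or ""          # step 1: object on a CC host
        if host in CC_HOSTS and p in LICENSE_PREDICATES:  # step 2: predicate
            others = [po for po in by_subject[s] if po != (p, o)]
            report[s] = {"license": o, "other": others}   # steps 3 and 4
    return report
```

The `other` list is what could be shown in a table next to the human-readable license information.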
*** bovinity has joined #cc11:37
paulproteusHazzah, our HTML sucks.11:43
paulproteusnathany, Do you have admin access to our Google apps?11:43
paulproteusCan you do me a favor and log in and change my email redirect so that it goes to paulproteus+cc<at>acm.jhu.edu instead of where it goes now?11:44
nathanyyes, IAM11:45
hdworakpaulproteus: you said you don't believe in e-mail address obfuscation11:45
*** tvol_ has quit IRC11:46
*** tvol has joined #CC11:46
paulproteusI wonder what "IAM" means.  I presume that's an ACK of some sort.11:47
nathany"in a minute"11:47
paulproteusOh, okay.11:47
nathanypaulproteus: does the current address look sort of... obfuscated?11:47
paulproteusCool, no huge rush.11:47
nathany(just making sure i change the right thing here)11:47
paulproteusIt looks sort of... uniqified, let's say.11:47
nathanypaulproteus: done11:48
hdworakmisspell "URI of XHMTL file"11:49
paulproteushdworak, Desperate times call for desperate measures....11:49
hdworakthen in the .tar.gz source they have wrong link to the license11:49
hdworakI'm talking about http://www.w3.org/2007/08/pyRdfa/11:49
hdworakI'll contact Dr. Herman when he's online11:50
nathanyhdworak: what's misspelled?11:50
nathanyit *does* expect a URI11:51
hdworakXHMTL (HM)11:51
nathanyah, i missed that11:51
* nathany goes back to java land11:51
hdworakpaulproteus: help help11:51
paulproteushdworak, Yes?  I must have lost your questions in the noise.11:52
hdworakI'd like to understand this process of parsing with the yet-to-be-done validator11:53
hdworaksomeone uploads/links/pastes an XHTML page11:53
hdworak1. we extract RDF/RDFa triples, yes?11:53
hdworak2. we scan for objects (of these triples) that contain the link to a valid CC license, yes?11:54
paulproteusLet's say we look through all the triples, and if it *should* link to a CC license, we check that it actually does.11:54
hdworaklike http://creativecommons.org/licenses/by/3.0/us/11:55
paulproteusRight, but if it links to http://creativecommons.org/broken/link/ instead you should flag that.11:55
hdworakwhat if it links to cr3ativecommons.org11:55
paulproteus...flag it...11:56
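
A minimal sketch of the "flag it" check, assuming a license link only needs to be checked for its host and for the general shape of its path. `check_license_url` and the path pattern are hypothetical; the pattern is a rough shape test that will wrongly flag some real CC URLs (for example ones without a version number), so a real validator would compare against the actual list of licenses instead.

```python
import re
from urllib.parse import urlparse

# Rough shape: /licenses/<code>/<version>/[<jurisdiction>/]
_LICENSE_PATH = re.compile(
    r"^/licenses/[a-z+-]+/\d+\.\d+(/[a-z]{2,}(-[a-z]+)?)?/?$"
)
_CC_HOSTS = {"creativecommons.org", "www.creativecommons.org"}


def check_license_url(url):
    """Return 'ok', 'not-cc-host' or 'bad-path' for a would-be license link."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host not in _CC_HOSTS:
        return "not-cc-host"   # e.g. a look-alike domain
    if not _LICENSE_PATH.match(parts.path):
        return "bad-path"      # right host, but not a license page
    return "ok"


for candidate in (
    "http://creativecommons.org/licenses/by/3.0/us/",
    "http://creativecommons.org/broken/link/",
    "http://cr3ativecommons.org/licenses/by/3.0/",
):
    print(candidate, check_license_url(candidate))
```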
paulproteusAlso you should look for the other metadata embedding standards, like the old RDF in a comment, and show how to upgrade them.  That much is in the proposal.11:56
hdworakas I said, we extract RDF/RDFa triples11:56
paulproteusYes, I'm saying you should do more than just that in the long run.11:56
hdworakwith the methods described there11:59
hdworak"in the head" RDF, "in the body" RDF, "data: URL" RDF, "linked external" RDF, "inside a comment" RDF12:00
hdworakonce we extract RDF from these different means, is it parsed in one and the same way?12:00
hdworakor do these methods have different RDF structure or something12:00
hdworakafter extraction12:01
paulproteusNo, same structure12:04
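
To illustrate "same structure": once the RDF/XML has been pulled out of wherever it was embedded, the same parsing step applies. The sketch below handles only the "inside a comment" case with the standard library; `rdf_blocks_in_comments` and `resource_triples` are hypothetical names, the sample markup is made up, and a real validator would more likely hand the extracted block to a proper RDF parser.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdf_blocks_in_comments(html):
    """Return every <rdf:RDF>...</rdf:RDF> block found inside an HTML comment."""
    blocks = []
    for comment in re.findall(r"<!--(.*?)-->", html, flags=re.DOTALL):
        match = re.search(r"<rdf:RDF.*?</rdf:RDF>", comment, flags=re.DOTALL)
        if match:
            blocks.append(match.group(0))
    return blocks


def resource_triples(rdf_xml):
    """Yield (subject, predicate URI, object URI) for rdf:resource properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for node in root:  # each child of rdf:RDF describes one subject
        subject = node.get("{%s}about" % RDF_NS, "")
        for prop in node:
            resource = prop.get("{%s}resource" % RDF_NS)
            if resource is not None:
                # ElementTree tags look like '{namespace}local'
                predicate = prop.tag[1:].replace("}", "", 1)
                triples.append((subject, predicate, resource))
    return triples


sample = """<html><body>
<!-- <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:cc="http://creativecommons.org/ns#">
  <cc:Work rdf:about="">
    <cc:license rdf:resource="http://creativecommons.org/licenses/by/3.0/"/>
  </cc:Work>
</rdf:RDF> -->
</body></html>"""

for block in rdf_blocks_in_comments(sample):
    print(resource_triples(block))
```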
hdworak<cc:license rdf:resource="http://flf.org/licenses/whiteHouseLawn" />12:05
hdworakshould that be dc:license ?12:05
paulproteusProbably not.12:07
paulproteusMaybe there is no dc:license?12:07
paulproteusOh, okay.12:08
hdworakthey even link to cc12:08
paulproteusYou'd have to check, but it might be rdf:sameAs cc:license.12:08
paulproteuscc: also has a license anyway.12:08
hdworakhow about their second example12:08
hdworaklicense="Licensed for use under Creative Commons Attribution 2.0."12:09
hdworakOmniscience Validator?12:09
paulproteushdworak, Here's how I feel about this discussion.12:09
paulproteusAre we writing the spec for your validator?12:09
hdworakno, I'm trying to understand what to do12:09
hdworakI do not have the experience with RDF/RDFa/cc licensing aside for a very basic usage (my home page)12:10
hdworakI'm trying to break the task into smaller chunks12:10
paulproteusLet's write this up on your wiki page then.12:10
hdworakbut you have a Wiki page on this topic already12:11
paulproteusI'm going to bike to the office.12:11
paulproteusCan you do me a favor and write your questions on that wiki page while I'm out?12:11
hdworakwhich wiki page?12:11
hdworakmaybe the soc one?12:11
*** paulproteus has quit IRC12:14
*** paulproteus has joined #cc12:39
hdworakdo not type with your feet12:40
hdworaknice 112:40
hdworakpaulproteus: I'm writing a summary of tools on the Wiki atm12:40
hdworakpaulproteus: I'll write the questions soon after12:40
hdworakI'd be grateful if we could discuss them today12:41
paulproteusWe probably will have time.12:41
*** rejon has quit IRC12:43
*** ankitg has joined #cc12:47
paulproteusHey ankitg.12:49
ankitgMorning paulproteus ... I haven't really had the time to look into the S3 tools to my heart's content, have only tried two so far [including the suggested s3sync.rb] ... the s3sync seems to require an OpenSSL library which I can't seem to find ...12:49
paulproteusIf you paste me the errors, I think I'll be able to help.12:50
paulproteusBTW, http://www.macosxhints.com/article.php?story=2008020123070799 seems to think it has all the s3 tools you'll need.12:50
ankitgokie, let me fire up the mac and get back to you with the error message ... though from what I recall it basically said "the environment is not setup"12:51
paulproteusJust don't store a backup of your Mac on our S3!12:51
ankitgI will try my best not to mess things up ...12:51
hdworakTHAT OpenSSL library?12:51
ankitghdworak is that for Ruby? ... I was looking for a gem actually ... ?12:54
*** sama_ has joined #cc12:57
ankitgpaulproteus: the exact error message I get from s3cmd upon giving it listbuckets as a parameter is "You didn't set up the environment variables"12:57
hdworakI dunno, I have not used Ruby12:57
ankitgthx hdworak ... let me give that a try ... though I think I need to download the source before I can "make" it ...12:59
ankitgyep as expected, no such directory ... I need to download and put it there ...13:00
ankitglooking at this though, makes me realize there may be no gem, but a source file for it which I would need to download and compile ...13:01
*** rejon has joined #cc13:02
hdworakgood luck13:03
ankitghdworak: I downloaded the latest version from the openSSL site ... the archive won't even load O_o ... anyways, I'll try the tool asheesh just mentioned, though like all other S3 tools I've found so far, it's backup centric ...13:06
hdworakMark Pilgrim also wrote Dive Into Accessibility13:10
hdworakI've also forgot its author13:10
hdworakvery impressive13:10
*** paulproteuss has joined #cc13:14
paulproteussToday's a low-tech day for me.13:14
rejonbovinity: what is so great about squirrelfish?13:15
bovinityrejon: apparently 3 times faster than the current JS engine in webkit13:16
bovinityrejon: i mean 1.6x. http://webkit.org/blog/189/announcing-squirrelfish/13:17
ankitgpaulproteuss: thx, that was awesome ... this download had just the app I needed ... wonder why they don't ship it with the version on the authors' website ... anyways, I see the whole list of logs now. (-:13:17
rejonis there gears for webkit/safari yet...must be soon, those google engineers are gaga over webkit, not ankit13:17
paulproteussankitg, Okay, great.13:18
paulproteussankitg, So don't download all 100 gigs; download a few files and take a look, and start to figure out what you need.13:18
paulproteussIf you want to do batch computing on the logs, we can open up an Amazon EC2 instance.13:18
ankitgit says 25887 objects, 319.191 GB in the ccommons bucket ...13:19
paulproteussWell, yeah, all 319 gigs I mean. (-:13:20
ankitgI would like to have a local copy, coz I will be traveling to India and i don't trust the internet speed there ...13:20
paulproteussDo you really have room for all 300 gigs?13:20
ankitgI was planning on getting myself an external HDD and making a copy ... they are cheap nowadays ...13:20
ankitgand it would speed things up and would give less room for messing up the originals ...13:21
ankitgI believe in redundancy when it comes to playing around with stuff ...13:21
ankitgI have 250+ GB free on an external right now ...13:23
paulproteussnathany, OK if ankitg does that?  Any idea what it would cost?13:24
ankitgand i think I don't need all the 319 GB, do I ?13:24
nathanypaulproteuss: does what; download it all?13:25
paulproteuss320GB download and 2000 PUT/GET requests (that is a total guess) make for about $5013:25
* paulproteuss shrugs13:25
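The back-of-the-envelope figure above can be reproduced in a few lines. This is only a sketch: the 320 GB and 2000 requests come from the conversation, while both rates below are assumptions chosen for illustration and should be checked against the current S3 price list.

```python
# Assumed rates, for illustration only (not taken from the log).
TRANSFER_OUT_USD_PER_GB = 0.17       # assumption
USD_PER_1000_GET_REQUESTS = 0.001    # assumption


def estimate_download_cost(gigabytes, requests):
    """Rough cost of pulling data out of a bucket: transfer plus request fees."""
    transfer = gigabytes * TRANSFER_OUT_USD_PER_GB
    request_fees = requests / 1000 * USD_PER_1000_GET_REQUESTS
    return transfer + request_fees


cost = estimate_download_cost(320, 2000)
print(f"about ${cost:.2f}")
```

Under these assumed rates the transfer charge dominates and the request fees are negligible, which is consistent with the "about $50" guess.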
paulproteussankitg, No, that's the thing13:25
paulproteussYou won't need all 300 gigs.13:25
nathanyi don't really understand *why*  but it's fine as a one-shot deal13:26
*** grahl has joined #cc13:26
paulproteussankitg, Well there you go, ^^ (-:13:26
nathany(my complaint wrt not understanding implies that you should figure out why you need to first ;) )13:26
paulproteussYou'll find that you have lots of data if you just grab a random sample of, say, 1% of them.13:26
ankitghmmm ... let me see if I can find an easy way to filter out what is required and what's not ... coz I get the impression that there's a whole lot more in there than I really need, but then again, my project doesn't have to be limited to those 4 logs (-:13:28
paulproteussYes, basically, spend some time figuring out what the data is.13:28
paulproteussIf you find it easiest to do that by downloading the whole data set and computing on it locally (which I can understand), say so and I'll say go ahead.13:30
paulproteussBut you'll find that it's divided up into sections, and that if you understand one file in a section you understand them all.13:30
ankitgokie, the data is nicely segregated into folders depending on which type of log it is ... I think I'll take one from each folder as a sample and I'll know what would be good to have then ... (-:13:30
paulproteussPrecisely. (-:13:30
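The "one file from each folder" plan can be sketched as follows, assuming the bucket listing has already been obtained as a plain list of object keys. The key names here are hypothetical; the real bucket layout is not shown in the log.

```python
import random
from collections import defaultdict


def sample_one_per_folder(keys, seed=None):
    """Group object keys by top-level folder and pick one key from each."""
    rng = random.Random(seed)
    by_folder = defaultdict(list)
    for key in keys:
        folder = key.split("/", 1)[0] if "/" in key else ""
        by_folder[folder].append(key)
    return {folder: rng.choice(names) for folder, names in sorted(by_folder.items())}


# Hypothetical key names, for illustration only.
example_keys = [
    "apache/2008-05-01.log.gz",
    "apache/2008-05-02.log.gz",
    "varnish/2008-05-01.log.gz",
    "search/2008-05-01.log.gz",
]
picked = sample_one_per_folder(example_keys, seed=0)
for folder, key in picked.items():
    print(folder, "->", key)
```

Only the picked keys then need to be downloaded, so the sample costs a handful of requests rather than the whole bucket.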
ankitgyay, making progress! (-:13:31
*** urbanmonkey|work has quit IRC13:31
ankitgokie, I need to wrap up some other projects before I leave for India, it's good I have access to this now, once I've figured out which sections are relevant, would it be okay for me to make a local copy?13:32
paulproteussMake a local copy of what you need, and what you might find useful.13:36
paulproteussDon't do it unnecessarily, but if there's a reason, then do.13:36
paulproteussnathany, Where is the CC Nagios again?13:37
nathanyon a5 and a613:37
nathanyat /nagios2, IIRC13:37
ankitgyep, I'll let you know how much I'll be copying before I copy anything so you are aware ...13:37
*** kristallpirat has quit IRC13:38
paulproteussankitg, Sure - the key is, I'm not asking you to ask permission, just to use some judgment.13:39
paulproteussThere's no point holding yourself totally back.13:39
ankitgthx, i'll take only what I need ... just want to play it safe and do whatever I need to do on a copy ...13:41
ankitgplus I will let you know how much I am taking so you know how much to expect in "damages"13:42
*** sama_ has quit IRC13:47
hdworakok, I've added two new sections to the article13:51
hdworakRelated Web applications and Web framework and libraries13:51
paulproteussnathany, Okay if I reboot a8 (first of a few)?13:55
paulproteussAll the web sites have transitioned off; there are 0 hits to Apache lately.13:55
nathanypaulproteuss:  i don't think i'm logged in, am i?13:55
nathanyno reason not to afik13:55
nathanyafaik, that is13:55
paulproteussOkay, I call lock on a8 then.13:55
paulproteusshdworak, Can you write up last paragraph on that page as bullet points?  Then we can move them into your timeline.13:56
paulproteuss...unless that's not what you need?13:56
paulproteussBasically I'm confused as to what you need from me.13:56
hdworakbut the timeline already mentions that13:57
hdworakit's just that we discuss the tools in the last paragraph13:57
hdworakI am to ask you some design-related questions13:57
hdworakI just didn't write them up yet13:57
hdworak(I said I'm gonna write about tools before writing them, because this is what I declared yesterday)13:58
paulproteussOh, okay, great.14:07
rejonwhoa, wikia search update is pretty cool14:07
hdworakpaulproteus: have I missed some tools in the Wiki article?14:09
*** kristallpirat has joined #cc14:09
hdworakccRdf most probably14:10
hdworakin this article:14:10
hdworakYou can download the tarball here.  <--- there is no link14:11
hdworak"The current implementation is available is"14:11
hdworakand rdfExtract14:11
hdworakI might have missed it, too14:12
*** tvol has quit IRC14:13
hdworakthe article http://wiki.creativecommons.org/RdfExtract14:13
hdworakcontains a dead link (current source)14:13
paulproteussnathany, Where has that gone?  That would be very useful for hdworak actually.14:16
paulproteuss"that" == "rdfextract.py"14:16
paulproteussBTW, you could import your old darcs projects into git (or ask me to).14:17
hdworakI've got the source code of ccValidator which has to include it14:17
paulproteusshdworak, Oh, okay.14:17
paulproteussThat's good at least.14:17
hdworakbut I'm just reporting misspells and broken links14:17
hdworakas it's the summer of spellchecking14:17
paulproteussWell, it's the summer of wikis!14:17
nathanypaulproteuss: yeah, i think the repos is currently on a box in a storage container somewhere in chicago14:18
paulproteussnathany, That's exciting.14:18
paulproteussYou don't have a darcs clone of it anywhere, do you?14:18
paulproteuss(or whatever darcs calls a full-history checkout)14:18
nathanyyeah, it would be if it were my unit (it's my friend Jeremy's)14:18
nathanyi may have one on my laptop @ home14:18
nathanythat code predates my full time employment @ CC14:18
paulproteussI realize.14:18
hdworakwe might call it a predator then14:18
hdworakpaulproteus: are you going to answer my questions on Wiki, too?14:23
hdworak'cause if so, it's gonna be a bit official, like we had no IRC14:23
hdworakif not, it might be better to pastebin these questions somewhere, as they will remain unanswered14:23
paulproteusshdworak, I haven't noticed questions on the wiki.14:25
paulproteussAre there some?14:25
paulproteussFor questions whose answers need to be organized, I prefer to answer them on the wiki.14:25
paulproteussBut for just discussion stuff, IRC is fine.14:25
paulproteussSomehow your reference to a pastebin confuses me.14:25
paulproteussnathany, I'm going to reboot a8 into an amd64 userland now.14:26
paulproteussWish me luck.14:26
nathanygood luck14:26
hdworakI haven't written them yet14:26
hdworakI do not have questions requiring organised answers, just a couple of short yes/no questions14:27
paulproteussOh, okay, then ask me again and I'll focus this time.14:27
hdworakwhat if someone misspells the predicate? xc:license dc:Lights ?14:28
hdworakor are we doing vAIdator?14:30
*** presroi_ has joined #cc14:39
*** stevel has quit IRC14:45
hdworak"in HTML/XMTML"14:45
paulproteussOkay, hi.14:45
hdworakanother misspell14:45
hdworakhi, paulproteus14:45
paulproteussThat misspelling you can fix!14:45
hdworakwhat's up?14:45
paulproteussit's a wiki after all.14:45
hdworakI know, I'm just pointing it out first, so you can see I'm reading/working14:45
paulproteusshdworak, Ignore that, I guess.14:45
hdworakcan I start asking questions?14:46
hdworakwhat if someone misspells object (of a license predicate)? http://xeativexommons.org/14:47
paulproteussThat falls under, "All unknown values for the license predicate should be flagged."14:47
paulproteussWhich I now declare as a reasonable strategy for the validator, unless you disagree.14:47
hdworakok, flagged14:48
paulproteussJust like the W3C HTML validator flags mistakes.14:48
hdworakbut do you agree that we cannot tell whether it is a misspell or whether it represents a VALID license?14:48
paulproteussWell, we can try to later on.14:48
hdworakhttp://xeativexommons.org/ vs http://veryserious.org/ourlicenses/myblues/14:48
paulproteussWe can do things like edit distance.14:48
*** presroi has quit IRC14:49
paulproteussRight, or even python's difflib's close_matches.14:49
paulproteussSo I would like to, but I admit that attempts to do that are going to be hazy guesses.14:49
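A minimal sketch of the "flag unknown values, then guess at typos" idea, using difflib.get_close_matches from the Python standard library. The short list of known licenses and the 0.8 cutoff are assumptions for illustration; as noted above, any such suggestion is a hazy guess and should be presented as one.

```python
import difflib

# A tiny stand-in for the full list of known license URIs.
KNOWN_LICENSES = [
    "http://creativecommons.org/licenses/by/3.0/",
    "http://creativecommons.org/licenses/by-sa/3.0/",
    "http://creativecommons.org/licenses/by-nc-nd/2.0/",
]


def check_license_uri(uri, cutoff=0.8):
    """Return (status, suggestion); status is 'known', 'possible-typo' or 'unknown'."""
    if uri in KNOWN_LICENSES:
        return "known", uri
    # Unknown values are always flagged; a close match only adds a hint.
    close = difflib.get_close_matches(uri, KNOWN_LICENSES, n=1, cutoff=cutoff)
    if close:
        return "possible-typo", close[0]
    return "unknown", None


print(check_license_uri("http://xeativexommons.org/licenses/by/3.0/"))
print(check_license_uri("http://veryserious.org/ourlicenses/myblues/"))
```

The cutoff is deliberately stricter than difflib's default of 0.6, because unrelated URLs share a lot of characters ("http://", ".org/", "licenses/") and would otherwise be reported as typos.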
hdworakok, we found an OBJECT which has license predicate14:50
hdworakwhat do we show aside of the license itself?14:50
hdworakif it is a cc license, we show a human readable name + a link to the cc Web site with human readable stuff, right?14:50
paulproteussWe might as well show the license buttons, and we could actually validate that if they <img src> any i.creativecommons.org buttons, those buttons are for the same license.14:51
paulproteussbe back in a few minutes, lunch14:51
*** stevel has joined #cc14:54
*** jgay has quit IRC15:01
*** encompass has quit IRC15:07
*** stevel has quit IRC15:13
*** kristallpirat has quit IRC15:19
*** Mihai` has joined #cc15:30
*** stevel has joined #cc15:42
hdworakpaulproteus, have mercy15:47
hdworakit's almost 11pm here :)15:47
paulproteusshdworak, back!15:48
hdworakcould we pls continue?15:48
paulproteussLet's see.15:48
hdworakok, so if we detect a license URI we check if it matches the cc images ?15:49
hdworakthat are embedded on the page?15:49
paulproteussThat seems like a nice thing to do.15:51
paulproteussSo, sure.15:51
hdworakwhat if someone put an image only on his/her page, w/o any RDFa/RDF/links and thinks it's enough15:51
paulproteussThen we should say it Fails Validation because it has no embedded metadata, but we did detect this image which means you should probably add this metadata: ______15:51
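The image check discussed above might look roughly like this, using only the standard library's html.parser. It is a sketch under assumptions: the button URL shape (http://i.creativecommons.org/l/by/3.0/88x31.png) is assumed, jurisdiction-specific buttons are mapped to the generic license, and the messages are placeholders rather than the validator's real wording.

```python
import re
from html.parser import HTMLParser

# Assumed URL shape for license buttons: .../l/<code>/<version>/...
BUTTON_RE = re.compile(r"^https?://i\.creativecommons\.org/l/([a-z-]+)/(\d+\.\d+)/")


class LicenseScanner(HTMLParser):
    """Collect rel="license" targets and licenses implied by button images."""

    def __init__(self):
        super().__init__()
        self.declared = set()  # from <a rel="license"> or <link rel="license">
        self.implied = set()   # from <img src="http://i.creativecommons.org/...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").split()
        if tag in ("a", "link") and "license" in rels and attrs.get("href"):
            self.declared.add(attrs["href"])
        if tag == "img":
            match = BUTTON_RE.match(attrs.get("src") or "")
            if match:
                code, version = match.groups()
                self.implied.add(f"http://creativecommons.org/licenses/{code}/{version}/")


def validate(html):
    scanner = LicenseScanner()
    scanner.feed(html)
    if not scanner.declared and scanner.implied:
        hint = ", ".join(sorted(scanner.implied))
        return "fail: no embedded metadata; consider adding " + hint
    if scanner.implied - scanner.declared:
        return "warning: button image does not match declared license"
    return "ok" if scanner.declared else "fail: no license information found"


print(validate('<img src="http://i.creativecommons.org/l/by/3.0/88x31.png">'))
```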
hdworakok, what else is shown if we do find a triple about cc license aside of the images?15:54
paulproteussWell, information about the chosen license.15:56
paulproteussThere is RDFa on the license URI's page that would probably make for a good start.15:57
hdworakthe one embedded in... a comment?15:58
paulproteussNo!  The license web page itself has RDFa....15:59
hdworakyes, I'm just looking at it15:59
hdworakthere is RDF embedded in a comment15:59
hdworakplus this: http://creativecommons.org/licenses/by/3.0/rdf15:59
paulproteussWell, basically, "show some information about the license"15:59
hdworakI'm looking at http://creativecommons.org/licenses/by/3.0/15:59
paulproteussOnce we're showing something, we can easily change precisely what we show.16:00
hdworakbut isn't it just better to link to this page: http://creativecommons.org/licenses/by/3.0/16:00
hdworakit's so user-friendly16:00
paulproteussWell, sure - but when we say, "You have valid metadata!" I think we should also say: "Here is what that metadata says:"16:00
*** stevel_ has joined #cc16:01
hdworakok, what about the other stuff like other triples of the same object16:01
hdworakwe ignore them?16:01
paulproteussTriples of the same object - can you explain what you mean by that?16:02
hdworaksure, like I would have16:03
hdworak<div xmlns:dc="http://purl.org/dc/elements/1.1/"><h2 property="dc:title">The Trouble with Bob</h2> <a rel="license" href="cc.org">cc-by</a></div>16:04
hdworakdoes the license here belong to the div or to the whole document?16:05
paulproteussOh, er, whatever the RDFa spec says.16:05
paulproteussI don't remember myself.16:05
paulproteussBut it seems like, to the whole document.16:05
hdworakif I would id="foobar" with that div16:05
paulproteussWell, here's the key - whatever statements the document makes, we should make very clear.16:06
hdworakbut we care only about the license-related stuff?16:06
hdworakwe ignore other RDF/RDFa-embedded information?16:06
paulproteussYes, *except* as it relates to the license stuff.16:07
paulproteussSo I mean, if someone uses a web page to declare that a separate URI has dc:title "Your mom" and license CC by 3.0 US then we should say, "You made these claims about this other document with license CC by 3.0 US: dc:title "Your mom" (etc)16:07
paulproteussThat's what makes sense to me.16:07
hdworakbut this dc:title etc. is shown in a simple table?16:08
paulproteussSure, I think that makes sense.16:09
hdworakit's not like I'm starting to process dc:title as Title etc.?16:09
paulproteussYeah, I don't think you need to do that.16:09
paulproteussIf there's time at the end or feedback from people says they really want it, then maybe you would want to.16:10
hdworakwhat about this means:16:12
hdworaklicense="Licensed for use under Creative Commons Attribution 2.0."16:12
hdworakif someone describes the license in human-readable text, instead of a cc URI16:12
paulproteussIf they specify a dc:license but not as a URI, then we should flag it and say, "We want you to include a license attribute as a URI."16:12
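Telling a URI apart from free text can be sketched with urllib.parse. Restricting the check to http and https is an assumption made here for simplicity; the function name and messages are illustrative only.

```python
from urllib.parse import urlparse


def looks_like_uri(value):
    """True if value parses as an absolute http(s) URI."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_license_value(value):
    if looks_like_uri(value):
        return "ok"
    return "flag: we want you to include the license as a URI"


print(check_license_value("Licensed for use under Creative Commons Attribution 2.0."))
print(check_license_value("http://creativecommons.org/licenses/by/2.0/"))
```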
hdworakwhat's the meaning of the "Defining dc:rights" in this document?16:15
hdworakthis relates to the RDF associated with the LICENSE not with the SUBJECT of a license?16:16
hdworakis that correct?16:16
*** stevel has quit IRC16:16
hdworakbears resemblance to http://creativecommons.org/licenses/by/3.0/rdf16:16
hdworakor is it the kind of RDF that we can see inside an HTML comment on a Web site?16:18
paulproteussWell, "RDF in a comment" is supposed to not be around anymore.16:18
hdworakyes, but I'm supposed to cover all the means, including deprecated16:18
paulproteussI guess you might find RDF like that in a <link rel="meta">16:18
hdworakand directly embedded in HEAD or BODY?16:18
paulproteussI think this standard remains current.16:18
hdworakand data: URI16:18
hdworakso it would be an RDF like that?16:19
paulproteussIt's just that the only place that's a good idea to put it nowadays is <link rel="meta">.16:19
*** Yaco has joined #cc16:19
paulproteussdata: URI in the <link rel="meta"> would be okay nowadays, as would <link rel="meta" href="URL pointing to RDF like that">16:19
hdworakso if one person would take all the code from here:16:20
hdworakand encode it to a data: URI16:20
hdworakit is just as good as linking to cc?16:20
paulproteussUh, I'm not sure.  nathany, thoughts?16:20
hdworakfrom the semantic POV16:20
paulproteussHmm, I guess so.16:20
nathanyno, its retarded16:20
hdworakbecause I want to imagine an example of such RDF16:20
hdworakwith info about the license16:21
nathanybecause that URI only makes assertions about the license16:21
nathanyit doesn't say *anything* about how the document is licensed16:21
paulproteussRight, of course.16:21
paulproteussAnd it's not a complete statement of the license.16:21
nathany(nevermind my general feeling that data: URIs are a little silly)16:21
hdworakok, so is there an example of the RDF that was supposed to be there inside comments/head/body/data: uri16:22
hdworakbecause if it's not http://creativecommons.org/licenses/by/3.0/rdf and it's not the code from the third example from http://wiki.creativecommons.org/Extend_Metadata16:22
hdworakwhich bear a striking resemblance16:22
hdworakthen I do not know what such RDF should look like16:23
*** nathany has quit IRC16:23
hdworakthe top level is rdf:Description16:23
hdworakin the example from the wiki we reference the license using cc:license16:24
paulproteussI'm trying to understand the question, hold on.16:27
paulproteussI have the feeling that it's not a blocker for you to continue somewhat; you can mark this down as TBD and we can ask nathany when he comes back.16:27
paulproteussIs that okay for now?16:28
hdworakI'm gonna try to decode this data: URI16:29
paulproteussBTW, where are you recording these answers?  It feels a lot like we're writing the spec.16:29
hdworakfrom the Wiki16:29
hdworakbecause it's probably the answer16:29
hdworakpaulproteus: we have a log from this channel, don't we?16:29
hdworakI'm noting all the answers ofc16:29
hdworakok, using http://meyerweb.com/eric/tools/dencoder/16:30
hdworakthe data URI yielded http://pastebin.com/f143f5fdf16:30
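The same decoding can be done offline instead of through a web tool. This is a minimal sketch of a data: URI decoder using only the standard library; it handles the percent-encoded and base64 forms and simplifies the default media type to "text/plain". The sample URI is made up for illustration and is not the one from the wiki.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri):
    """Return (media_type, payload_bytes) for a data: URI."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, data = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data: URI has no comma separator")
    if header.endswith(";base64"):
        media_type = header[: -len(";base64")] or "text/plain"
        return media_type, base64.b64decode(unquote_to_bytes(data))
    return header or "text/plain", unquote_to_bytes(data)


# Made-up example payload, not the RDF from the wiki page.
media_type, payload = decode_data_uri("data:application/rdf+xml,%3Crdf%3ARDF%2F%3E")
print(media_type, payload.decode("utf-8"))
```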
paulproteussWe do have a log, but these questions are forming a spec that should go on your SoC page.16:31
paulproteussThey'll be useful for writing test cases.16:31
hdworakofc = of course16:31
paulproteussOkay, great.16:31
hdworakso if I see such an RDF16:32
hdworakI try to retrieve information about the licences16:32
paulproteuss"retrieve" by downloading http://creativecommons.org/licenses/by-nc-nd/2.0/ and http://www.eff.org/IP/Open_licenses/eff_oal.html ?16:32
hdworakand the rest of the data (dc:creator, dc:description) about the object goes into a table16:32
hdworakwell, that's what I'm asking16:32
paulproteussWell, truth be told, you'll probably be able to look up information about CC licenses in the CC licenses' RDF files without crawling them every time someone makes a request.16:33
paulproteussThat is to say, you'll have a copy of the CC licenses' data in RDF sitting on disk next to your application.16:33
paulproteussAll you really need to do for a validator is tell if the license attributes point to a valid CC license.16:33
hdworakwhat are the possible methods to embed license info in an RSS feed?16:35
hdworakare there any examples?16:36
hdworakis anybody doing it?16:36
paulproteussI think blip.tv is doing it.16:36
hdworakok, so RSS 1.0 and Atom 1.0 uses rel="license"16:38
hdworakand RSS 2.0 a qualified name16:39
hdworakand the methods listed on Wiki are the only ones we check, is that right?16:39
hdworak(when it comes to feeds)16:39
hdworakor are there any secret unlisted methods left?16:39
hdworakor historical, deprecated ones?16:39
paulproteussI think here, we only check the ones listed on the wiki.16:40
hdworakwhat other filetypes should the validator handle?16:40
hdworakI know about http://wiki.creativecommons.org/Category:Filetype16:41
hdworakbut what should it handle16:41
paulproteussMost of Category:Filetype can be handled by liblicense, anyway, which you can easily hook into at some point.16:41
paulproteussI think just web stuff, so just feeds + web pages + RDF.16:41
paulproteussliblicense is a project of mine (former intern project, actually) that can read license info out of lots of media files.16:41
paulproteussIt has Python bindings.16:42
hdworakfeed+Web pages+RDF16:42
hdworakso we ignore other well-formed XML?16:42
hdworakthe validator can say:16:42
hdworak"sorry, but this isn't RSS, ATOM, HTML, XHTML, or RDF"16:43
paulproteussOh, interesting.  I guess if it has assertions, then you could handle it.  But that's fairly Low Priority.16:43
*** Yaco has quit IRC16:43
paulproteussSo it's up to you - what I suggest is decide not to handle it at first, and then toward the end if there's time think about doing it.16:43
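Deciding whether the input is one of the handled formats could start from the root element. The sketch below uses xml.etree.ElementTree and treats anything it does not recognise, including other well-formed XML, as unsupported for now. The fallback for tag-soup HTML is deliberately crude, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_ROOT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF"
RSS1_CHANNEL = "{http://purl.org/rss/1.0/}channel"
ROOTS = {
    "{http://www.w3.org/2005/Atom}feed": "Atom",
    "{http://www.w3.org/1999/xhtml}html": "XHTML",
    "rss": "RSS 2.0",
    "html": "HTML",
}


def sniff_format(text):
    """Return a format name, or None if the input is not a handled format."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Tag-soup HTML is not well-formed XML; fall back to a crude check.
        return "HTML" if "<html" in text.lower() else None
    if root.tag == RDF_ROOT:
        # RSS 1.0 documents are RDF/XML with an RSS channel inside.
        return "RSS 1.0" if root.find(RSS1_CHANNEL) is not None else "RDF"
    return ROOTS.get(root.tag)


for sample in ('<rss version="2.0"><channel/></rss>',
               '<svg xmlns="http://www.w3.org/2000/svg"/>'):
    kind = sniff_format(sample)
    print(kind or "sorry, but this isn't RSS, Atom, HTML, XHTML, or RDF")
```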
*** nathany has joined #cc16:44
nathanypaulproteuss: what're the odds that just reconfiguring the jdk on a8 will "heal" it? :)16:45
paulproteussnathany, It needs healage?16:45
nathany(and what's the flag to dpkg to reconfigure)16:45
paulproteussdpkg-reconfigure $package16:45
nathanynathan@a8:~$ /usr/lib/jvm/java-1.5.0-sun/bin/java -version16:45
nathanyError: no `client' JVM at `/usr/lib/jvm/java-1.5.0-sun-'.16:45
paulproteussa8:/var/www# java16:45
paulproteuss-su: java: command not found16:45
hdworakok, so it all can be parsed from direct input16:45
hdworak'cause they are all text formats, not binary16:45
hdworakhow many levels of crawling can we have?16:46
paulproteusshdworak, I don't know, start at 3 and we'll see if we ever need to exceed that.16:46
* paulproteuss shrugs16:46
paulproteussThat's a question where you can make a decision without asking me, and mention the decision you made later.16:47
hdworaksomeone pastes a link to a Web page he/she wants to check16:47
hdworakthat Web page has a link to RDF (as <link rel="meta")16:47
hdworakwe download that (crawl depth 2)16:47
hdworaknow that RDF has a cc URI to a license16:47
hdworakwe download that (crawl depth 3) - if it's not on HDD already16:48
hdworakis there anything else missing? something that can be depth 4 ?16:48
paulproteussnathany, I fixed java16:50
paulproteussI just purged and reinstalled the package.16:50
nathanythank you16:50
paulproteussI copied over /etc/ wholesale.16:50
paulproteusshdworak, Nothing I can think of, but you should log an error if you happen to need deeper than that.16:50
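The depth limit and the "log an error if you need to go deeper" rule can be sketched like this. Fetching and link extraction are passed in as functions so the sketch stays independent of any HTTP library or local cache; the function names and the link graph are hypothetical.

```python
import logging

MAX_DEPTH = 3
log = logging.getLogger("validator.crawl")


def crawl(url, fetch, find_links, depth=1, seen=None):
    """Fetch url and follow its links up to MAX_DEPTH; returns {url: body}."""
    seen = {} if seen is None else seen
    if url in seen:
        return seen
    if depth > MAX_DEPTH:
        # Do not fetch, but record that the limit was too low for this input.
        log.error("needed crawl depth %d for %s; limit is %d", depth, url, MAX_DEPTH)
        return seen
    body = fetch(url)
    seen[url] = body
    for link in find_links(url, body):
        crawl(link, fetch, find_links, depth + 1, seen)
    return seen


# Hypothetical link graph standing in for real HTTP fetches:
# page (depth 1) -> linked RDF (depth 2) -> license (depth 3) -> too deep.
links = {
    "page.html": ["meta.rdf"],
    "meta.rdf": ["license"],
    "license": ["something-deeper"],
    "something-deeper": [],
}
result = crawl("page.html",
               fetch=lambda url: "<body of %s>" % url,
               find_links=lambda url, body: links[url])
print(sorted(result))
```

A fetch function that first looks for a copy on disk would cover the "if it's not on HDD already" case without changing the crawl logic.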
paulproteussnathany, BTW16:51
paulproteussI didn't copy the tomcat stuff back in - I don't know what directories they are.16:51
*** stevel_ has quit IRC16:51
hdworakare RDFa tools/parsers helpful when it comes to RDF parsing and vice-versa (RDF parsers for RDFa)?16:51
paulproteussRDFa parsers are good for getting data into RDF data model libraries.16:51
paulproteussYou'll notice that librdfa links to libraptor.16:51
paulproteussnathany, If you list directories for me, I can restore them from /usr or /var or wherever.16:52
paulproteussOr you can yourself, if you prefer.16:52
hdworakso RDFa to RDF and then parse RDF like it was there from the very beginning?16:54
hdworakI've got no more questions written as for now16:55
hdworakthanks for all the answers16:55
hdworakthey helped me to understand your expectations better16:55
paulproteusshdworak, Okay, great - now please record these on the wiki or something!16:55
hdworakwill do, but not tonight16:55
hdworakit's midnight16:56
paulproteussOkay (-:16:56
paulproteussnathany, You can run (I won't):16:57
paulproteussa8:~# /pull-from-ia32.sh /usr/local/nutch16:57
paulproteussIt should work according to your expectations.16:57
ankitgIt's almost 6 am here ... c'mon Hugo! (-:16:57
paulproteusslol, ankitg!16:58
ankitgand I am at the lab in Uni even ... on Mibbit coz my Uni blocks / throttles / does something bad to IRC ...16:59
paulproteussankitg, jhu.edu blocks IRC.16:59
hdworakankitg: I used to do that a couple of years ago16:59
paulproteussIt was pretty evil of them.16:59
hdworakankitg: now I know it doesn't make sense16:59
paulproteussBut you should email network security and explain that you need it for your summer internship with us.16:59
paulproteussDon't tell them about mibbit, just annoy them.16:59
hdworakankitg: I guess I've become an old geezer16:59
ankitgpaulproteuss: i'll do that when I come back from India, I should be fine when I am there (-:17:00
paulproteussankitg, Well, might as well email them before you go to give them time to twiddle their thumbs.17:00
ankitgsure ...17:01
ankitgmaybe I'll CC you for good measure ... as if to say, see it's legit, I am not cooking stuff up :P17:01
nathanypaulproteuss: is this expected?17:01
nathanyh$ sudo /pull-from-ia32.sh /usr/local/nutch17:02
nathanyGoing to rescue /usr/local/nutch in two seconds...17:02
nathanyControl-c now if you dislike that plan.17:02
nathanymv: cannot move `/ia32//usr/local/nutch' to `/ia32.trash/16245./home/nathan/oenutch': No such file or directory17:02
paulproteussnathany, /me gulps17:02
paulproteussHah, not really.17:02
paulproteussA typo on my part, but not a big deal.17:02
* paulproteuss fixes17:02
paulproteuss(fixed; no need to re-run the script; it did the copy okay)17:02
nathanythere is no /usr/local/nutch :)17:03
paulproteussIt put it in $PWD/nutch17:03
nathanyoh, is it because it's a symlink?17:03
paulproteussTypo again, fixing.17:03
paulproteussAnd it copies, not moves, so this isn't a big deal (-;17:03
nathanyuh, which $PWD?17:03
nathanyi  don't see it17:03
paulproteussWell try it again anyway.17:04
paulproteussI think it should work this time.17:04
paulproteussOh, wait, the "mv" will error out again.17:04
paulproteussYeah, wait.17:04
nathanyuh, ok :)17:04
paulproteussName another directory to import.17:06
*** Mihai` has quit IRC17:08
*** pmiller has joined #cc17:10
nathanypaulproteuss: that's the only one i know of (although it does appear to have followed the link instead of preserving it)17:10
nathanynot a big deal in this case17:10
paulproteussOkay (oops)17:10
paulproteussInterestingly I think /etc/alternatives may not quite be right also.17:11
paulproteussNot in any dangerously terrible way as far as I can see.17:11
paulproteussVarnish and Apache are up.17:15
nathanyi'm working on restoring the tomcat stuff17:16
nathany(rebuilding from an up to date checkout)17:16
*** stevel has joined #cc17:17
paulproteussGreat.  I could move the sites back, but that isn't interesting to me anymore so I'll do it later tonight.17:17
paulproteussnathany, done - I just "mv"d them.17:21
nathanypaulproteuss: thanks much!17:21
nathanypaulproteuss: nutch still seems borked but i'll fix (just fyi in case you're fixing your script)17:22
paulproteussNope, I'm abandoning my script for now.17:22
*** grahl has quit IRC17:23
paulproteussnathany, Oh, stand?17:25
nathanyyes, 2 min17:26
paulproteussGreat, just IRC ping me17:26
*** hdworak has quit IRC17:30
*** greg-g has quit IRC17:51
*** stevel has quit IRC18:24
*** stevel has joined #cc18:28
*** UltraMagnus has joined #cc18:28
*** presroi_ has quit IRC18:39
*** nathany has quit IRC18:41
*** bovinity has quit IRC18:57
*** ajbrooks has quit IRC19:04
*** BobChao has joined #cc20:01
*** BobChao has left #cc20:02
*** BobChao has joined #cc20:08
*** Yaco has joined #cc20:09
*** stevel has quit IRC20:14
*** gdsf has joined #cc20:17
*** gdsf is now known as bring320:17
*** tvol has joined #CC20:20
*** UltraMagnus has quit IRC20:29
*** bring2 has quit IRC20:30
*** rejon has quit IRC20:31
*** ajbrooks has joined #cc20:45
*** stevel has joined #cc20:49
*** jgay has joined #cc21:06
*** BobChao has quit IRC21:12
*** paulproteus has quit IRC21:33
*** paulproteus has joined #cc21:34
*** paulproteuss has quit IRC21:34
*** stevel has quit IRC21:48
*** jgay has quit IRC22:10
*** rejon has joined #cc22:21
*** tvol has quit IRC22:38
*** Yaco has quit IRC22:54
*** Yaco has joined #cc22:55
*** ankitg has quit IRC23:29
