| *** parker-fcnyu has joined #cc | 00:10 | |
| *** mralex has quit IRC | 00:27 | |
| *** nkinkade has quit IRC | 01:16 | |
| *** jgay has quit IRC | 01:28 | |
| *** erlehmann has quit IRC | 01:33 | |
| *** pktck has joined #cc | 01:44 | |
| *** erlehmann has joined #cc | 01:46 | |
| *** pktck has quit IRC | 02:20 | |
| *** parker-fcnyu has quit IRC | 02:56 | |
| *** pktck has joined #cc | 03:02 | |
| *** parker-fcnyu has joined #cc | 03:25 | |
| *** oshani has quit IRC | 05:32 | |
| *** JoiIto has joined #cc | 05:38 | |
| *** Kaetemi has quit IRC | 06:03 | |
| *** JoiIto has quit IRC | 06:27 | |
| *** parker-fcnyu has quit IRC | 06:32 | |
| *** MarkDude has quit IRC | 07:20 | |
| *** erlehmann has quit IRC | 07:27 | |
| *** pktck has quit IRC | 07:29 | |
| *** pktck has joined #cc | 07:40 | |
| *** MarkDude has joined #cc | 08:09 | |
| *** pktck has quit IRC | 08:19 | |
| *** JED3 has quit IRC | 08:35 | |
| *** JED3 has joined #cc | 08:35 | |
| *** pktck has joined #cc | 08:40 | |
| *** wormsxulla has quit IRC | 08:46 | |
| *** niekie has quit IRC | 09:01 | |
| *** niekie has joined #cc | 09:05 | |
| *** pktck has quit IRC | 09:24 | |
| *** wormsxulla has joined #cc | 09:34 | |
| *** FHaag has joined #cc | 09:44 | |
| *** pktck has joined #cc | 09:45 | |
| *** pktck has quit IRC | 09:51 | |
| *** bassel has joined #cc | 09:59 | |
| *** pktck has joined #cc | 10:03 | |
| *** MarkDude has quit IRC | 10:06 | |
| *** MarkDude has joined #cc | 10:07 | |
| *** pktck has quit IRC | 10:08 | |
| *** pktck has joined #cc | 10:28 | |
| *** pktck has quit IRC | 10:39 | |
| *** pktck has joined #cc | 10:42 | |
| *** pktck has quit IRC | 10:47 | |
| *** akila87 has joined #cc | 10:52 | |
| *** oshani has joined #cc | 11:19 | |
| *** oshani has quit IRC | 11:21 | |
| *** bassel has quit IRC | 12:12 | |
| *** bassel has joined #cc | 12:12 | |
| *** bassel has quit IRC | 12:57 | |
| *** tvol has joined #cc | 13:04 | |
| *** oshani has joined #cc | 13:08 | |
| *** bassel has joined #cc | 13:09 | |
| *** midoubleko has quit IRC | 13:44 | |
| *** midoubleko has joined #cc | 13:45 | |
| *** midoubleko has quit IRC | 13:50 | |
| *** midoubleko has joined #cc | 13:50 | |
| *** paroneayea has quit IRC | 13:52 | |
| *** paroneayea has joined #cc | 13:56 | |
| *** igorlukanin has joined #cc | 14:06 | |
| *** akila87 has left #cc | 14:08 | |
| *** parker-fcnyu has joined #cc | 14:20 | |
| *** akila87 has joined #cc | 14:28 | |
| *** nkinkade has joined #cc | 14:52 | |
| *** Pascalcmoi has joined #cc | 15:22 | |
| Pascalcmoi | Does a website using the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License mean that only USA residents can legally read the page? | 15:23 |
|---|---|---|
| paroneayea | Pascalcmoi: no, it just means that the legal code is tuned particularly to the legal system of United States law | 15:23 |
| Pascalcmoi | thanks paroneayea | 15:24 |
| paroneayea | np | 15:24 |
| *** Pascalcmoi has left #cc | 15:25 | |
| paroneayea | nkinkade: Yo | 15:31 |
| paroneayea | ah, never mind | 15:31 |
| *** jed_ has joined #cc | 15:34 | |
| *** JED3 has quit IRC | 15:37 | |
| paroneayea | well | 15:38 |
| paroneayea | we're running sanity again | 15:38 |
| paroneayea | now with caching of RDF queries and New Zealand added to jurisdictions.rdf | 15:39 |
| *** oshani has quit IRC | 15:41 | |
| *** Kaetemi has joined #cc | 15:44 | |
| *** jed_ has quit IRC | 15:59 | |
| *** oshani has joined #cc | 16:04 | |
| *** tieguy has joined #cc | 16:12 | |
| *** erlehmann has joined #cc | 16:27 | |
| *** mralex has joined #cc | 16:49 | |
| *** jed_ has joined #cc | 16:50 | |
| *** erlehmann has quit IRC | 16:50 | |
| *** igorlukanin has quit IRC | 16:50 | |
| nkinkade | paroneayea: What's the bit about cc.engine taking a while to start up after restarting Apache? | 16:53 |
| nkinkade | How long? | 16:53 |
| paroneayea | nkinkade: a few seconds | 16:53 |
| nkinkade | Oh, that's not too bad. Do you think the problems with paster running away with memory should be gone too? | 16:54 |
| nkinkade | Before, a script had to run every few minutes to check memory usage and would have to kill paster about every 7 or 8 hours to get it to release memory. | 16:54 |
| paroneayea | I hope so, but I'm not sure what would have been causing the memory load in paster | 16:55 |
| nkinkade | NRY had some ideas, but I never knew/understood just what they were. | 16:55 |
| nkinkade | We'll know soon enough, because I get mails when the script reloads paster, but I suspect I'll need to change the script and pid files may have moved around. | 16:55 |
| paroneayea | that's part of the thing | 16:56 |
| paroneayea | we switched this over to fastcgi so | 16:56 |
| paroneayea | there's no separate daemon | 16:56 |
| paroneayea | apache / mod_fcgid is starting and managing the process | 16:56 |
| paroneayea | and handling forking and etc | 16:56 |
| nkinkade | paroneayea: Cool and I see that the cc.engine pid is still at /var/run/cc.engine.pid | 16:57 |
| nkinkade | And the script should still work if necessary. | 16:57 |
| paroneayea | hm, no.. I don't think that does anything | 16:57 |
| paroneayea | I just haven't shut off the old system | 16:57 |
| nkinkade | Oh. | 16:57 |
| paroneayea | in case we need to switch things over fast | 16:57 |
| paroneayea | so technically we are running doubletime | 16:58 |
| paroneayea | two cc.engines are running right now | 16:58 |
| *** FHaag has left #cc | 16:58 | |
| nkinkade | How long does it take to shut the old system down? Nothing more than /etc/init.d/cc-engine stop/start, right? | 16:58 |
| paroneayea | yeah | 16:58 |
| paroneayea | you have to do it from the old cc.engine directory | 16:58 |
| nkinkade | The old init script no longer works? | 16:58 |
| paroneayea | it does | 16:58 |
| paroneayea | but you always seemed to need to run it from there | 16:59 |
| paroneayea | if I ran it from anywhere else it did nothing | 16:59 |
| nkinkade | I never found that to be true. Hmm. | 16:59 |
| nkinkade | Can I try it now? | 16:59 |
| paroneayea | go for it | 16:59 |
| paroneayea | shouldn't affect the running engine at all | 16:59 |
| nkinkade | I just ran it from my home directory and it worked. | 17:00 |
| nkinkade | It also just reclaimed about 500M of memory. | 17:00 |
| paroneayea | ah :) | 17:01 |
| nkinkade | Which can now be used for other things. | 17:01 |
| paroneayea | yay! | 17:01 |
| nkinkade | Like disc caching, etc. | 17:01 |
| nkinkade | Cool. | 17:01 |
| paroneayea | anyway cc.engine has been running all morning, I've noticed no problems, and nobody's emailed webmaster with any new problems | 17:02 |
| paroneayea | heading out to lunch | 17:11 |
| *** niekie has quit IRC | 17:14 | |
| *** igorlukanin has joined #cc | 17:16 | |
| *** michi__ has joined #cc | 17:16 | |
| *** niekie has joined #cc | 17:19 | |
| *** akila87 has left #cc | 17:20 | |
| *** niekie has quit IRC | 17:27 | |
| *** niekie has joined #cc | 17:31 | |
| *** niekie has quit IRC | 17:38 | |
| *** niekie has joined #cc | 17:40 | |
| *** tvol has quit IRC | 17:56 | |
| *** michi__ has quit IRC | 17:57 | |
| nkinkade | I just realized that Gmail for Google Apps has been sending all info@ and webmaster@ emails to spam! | 17:57 |
| paroneayea | nkinkade: oh no! :( | 18:14 |
| nkinkade | It's not uncommon for Google to send legitimate email to spam, which is why I check over it every single day. I'm just not sure why I didn't catch it earlier. | 18:16 |
| *** igorlukanin has quit IRC | 18:48 | |
| *** akozak has joined #cc | 18:50 | |
| dithyramble | hey akozak | 18:57 |
| akozak | hey | 18:57 |
| akozak | dithyramble, just wondering when you were flying out to see if we could get to the airport together | 18:58 |
| paulproteus | akozak: FWIW I'm heading to greg-g's place Thu night | 18:58 |
| dithyramble | I'm flying out of LAN at 7:11pm (on United 5702) | 18:58 |
| paulproteus | Wanna join us? (-: | 18:58 |
| paulproteus | (Then I'm flying out of DTW that night, I think 7 PM) | 18:59 |
| paulproteus | (er, I'm flying out of DTW Fri night at 7 PM) | 18:59 |
| paulproteus | Let me rephrase it. You should join us. | 18:59 |
| paulproteus | You can work remotely on Friday, surely. | 19:00 |
| akozak | paulproteus, thanks, but I shouldn't be away from home that long... will still be in the process of moving. | 19:00 |
| paulproteus | Aww, okaaaaayyyyyy. | 19:00 |
| *** midoubleko has quit IRC | 19:00 | |
| akozak | yea, I'm sure it would be fun :) | 19:00 |
| *** midoubleko has joined #cc | 19:00 | |
| akozak | dithyramble, ok perfect, thanks | 19:01 |
| *** wormsxulla has quit IRC | 19:13 | |
| *** parker-fcnyu has quit IRC | 19:15 | |
| paroneayea | well | 19:25 |
| paroneayea | nkinkade: regarding cpu usage on a5 | 19:27 |
| paroneayea | it looks like most often it's all the apache processes | 19:27 |
| nkinkade | paroneayea: What do you think? | 19:27 |
| paroneayea | looking at top | 19:27 |
| paroneayea | nkinkade: what do you say we switch over to nginx | 19:27 |
| paroneayea | kidding kidding kidding | 19:28 |
| paroneayea | in seriousness though, it could simply be that a5 just gets a lot of http traffic all the time | 19:29 |
| nkinkade | paroneayea: So does a8, much more in fact, but a8 is less loaded. | 19:29 |
| paroneayea | hm | 19:29 |
| nkinkade | Granted a8 is mostly small static files. | 19:29 |
| nkinkade | :-) | 19:29 |
| paroneayea | is a8 where the buttons are then? | 19:29 |
| paroneayea | the embeddable images I mean | 19:30 |
| paroneayea | I've always thought that must be a huge amount of http traffic | 19:30 |
| nkinkade | Yeah, a8 hosts all those icons and buttons. | 19:31 |
| nkinkade | a5 right now is pumping out steadily around 600KB/s to 800KB/s | 19:31 |
| paroneayea | what could make apache really cpu intensive? | 19:31 |
| paroneayea | maybe a lot of rewrite rules and etc? | 19:32 |
| *** wormsxulla has joined #cc | 19:32 | |
| nkinkade | Varnish's hitrate seems to hover around 95%, which doesn't seem *too* bad, though I guess it could be better. | 19:34 |
| nkinkade | And we have APC caching PHP's opcode and that all looks nice: http://a5.creativecommons.org/apc.php | 19:35 |
| nkinkade | A 100% hit rate for APC. | 19:35 |
| nkinkade | a5 does have quite a few rewrite rules. | 19:35 |
| paroneayea | I turned on the rewrite log for a bit | 19:36 |
| paroneayea | it looked like every static file hit hits a LOT of rules | 19:36 |
| nkinkade | paroneayea: Yeah, don't leave the log on for long. | 19:39 |
| nkinkade | Especially if your loglevel is more than 5 or 6 | 19:39 |
| nkinkade | It will produce vast amounts of data and just slow things down even more. | 19:39 |
| nkinkade | To see how many rewrite rules it has to hit, one just needs to look at the vhost config in confg/ | 19:40 |
| nkinkade | conf/ | 19:40 |
| nkinkade | We are also gzipping things when a client can accept it, so that may be taking some CPU, but I would expect Varnish to cache the gzipped response. | 19:41 |
| paroneayea | running pages from the new license engine through the validator | 19:42 |
| paroneayea | it's not validating, though the templates should be the same as the old engine | 19:42 |
| paroneayea | though other pages on cc.org aren't validating either :( | 19:42 |
| nkinkade | Most pages on CC.org have never validated. | 19:46 |
| paroneayea | Looks like there's another RDF issue with the new engine | 19:47 |
| paroneayea | as in, since the new engine uses the RDF as the database | 19:47 |
| paroneayea | it's exposing problems we have with missing things in our RDF | 19:47 |
| nkinkade | paroneayea: Did you just see that webmaster@ email? | 19:48 |
| paroneayea | yeah | 19:48 |
| paroneayea | http://creativecommons.org/international/br/ <- the licenses on here not showing up | 19:48 |
| paroneayea | and here's why: | 19:49 |
| paroneayea | we only have RDF for 2.5 licenses w/ brazil | 19:49 |
| paroneayea | not 3.0 | 19:49 |
| nkinkade | Seems like maybe the unit tests should check this ... run through each jurisdiction and fetch the deed. If a 404 comes back, then the test fails. | 19:49 |
| paroneayea | but based on what data? | 19:51 |
| paroneayea | according to the RDF this is correct | 19:51 |
| *** JoiIto has joined #cc | 19:51 | |
| paroneayea | we don't have those licenses in the RDF is what | 19:51 |
| paroneayea | I could put together a scraper possibly that checks pages like: http://creativecommons.org/international/br/ | 19:51 |
| paroneayea | and looks for all the licenses they're expecting to exist | 19:52 |
| nkinkade | paroneayea: It's ugly but it wouldn't be hard to scrape international/ | 19:52 |
| paroneayea | yeah | 19:52 |
| nkinkade | Then again paulproteus might just find it beautiful. | 19:52 |
| paroneayea | :) | 19:52 |
| nkinkade | He's like that. :-) | 19:52 |
| paroneayea | why don't I roll back the engine and write a tool/test to do that | 19:53 |
| paroneayea | so we can make sure we're not screwing over any other jurisdictions with missing RDF data | 19:53 |
| nkinkade | paroneayea: As a suggestion, you could also look into grabbing the data directly from the WP database. | 19:58 |
| nkinkade | But I guess a list of the juris. is not what you need. | 19:59 |
| *** oshani has quit IRC | 19:59 | |
| nkinkade | And the db has nothing about version number, I think. | 19:59 |
| nkinkade | Which apparently is what you need. | 20:01 |
| nkinkade | paulproteus: dithyramble: Where on a6 is the crawl data being stored? | 20:04 |
| nkinkade | Do you feel that it's vital that it be backed up? | 20:04 |
| paulproteus | nkinkade: We haven't really done anything with a6. We're developing locally. | 20:04 |
| nkinkade | Ah. | 20:04 |
| nkinkade | Somewhere something is eating up a lot of space in the last week. | 20:05 |
| paulproteus | Based on this conversation, what I'll do is remember to tell you when we deploy and start wanting backups. | 20:05 |
| paulproteus | Huh. | 20:05 |
| nkinkade | paulproteus: Soon it shouldn't matter. | 20:05 |
| paulproteus | Okay. | 20:05 |
| paulproteus | (Because of the Rapture?) | 20:05 |
| *** oshani has joined #cc | 20:05 | |
| nkinkade | Once we move over to hosting at the ISC (hopefully) we'll have more bandwidth and also a lot more disc space and I intend to backup / from top to bottom. | 20:06 |
| paulproteus | Oh my GOD | 20:06 |
| paulproteus | AWEESOME | 20:06 |
| * paulproteus is so jealous. | 20:06 | |
| nkinkade | But for the moment, we are still in the CC office and disc space is down to 35G. | 20:06 |
| paulproteus | Oh, you just mean backups' hosting? | 20:06 |
| nkinkade | paulproteus: I was being unfair to you. I noticed it go up and up over the past week and I automatically assumed it coincided with your return. :-) | 20:07 |
| paulproteus | nkinkade: Heh (-: | 20:07 |
| nkinkade | In the past, with resource issues, I could usually single you out and be right about 50% of the time. :-) | 20:07 |
| paulproteus | (-: | 20:08 |
| nkinkade | With that type of percentage, I usually just shot first and asked questions later. | 20:08 |
| nkinkade | Although a6 *is* looking suspicious: | 20:09 |
| nkinkade | backup:/media/1TB/backups/creativecommons# du -sh * | 20:09 |
| nkinkade | 62G a5.creativecommons.org | 20:09 |
| nkinkade | 281G a6.creativecommons.org | 20:09 |
| nkinkade | Not to say 281G is unreasonable, but a5 being our "main" machine and using only about 20% of the disc space that a6 is using seems odd. | 20:10 |
| paulproteus | I'll leave this to you, nkinkade (-: | 20:10 |
| nkinkade | paulproteus: So to sum up ... you know of no places on a6 that may have lately been loaded up with lots of data. | 20:10 |
| nkinkade | ? | 20:10 |
| paulproteus | I know of nothing related to me lately on a6. | 20:10 |
| paroneayea | nkinkade: I'm going to start the old engine back up, just fyi | 20:11 |
| nkinkade | paroneayea: Cool. | 20:12 |
| paroneayea | paulproteus: I have a question for you, scraping related | 20:36 |
| paroneayea | http://creativecommons.org/international/ what's the best way to scrape for the licenses that appear under Completed Licenses vs the ones that appear under Project Jurisdictions? | 20:36 |
| paroneayea | my guess is that since they aren't distinguished by appearing in separate divs and etc there's no way to really do things via xpath | 20:37 |
| paroneayea | so will I just have to "iterate until I hit that point"? | 20:37 |
| paulproteus | Yeah, I think that's what you have to do. | 20:39 |
| paulproteus | You could also change the template so they have a class or something. | 20:39 |
| paroneayea | yes I suppose I could look to change that page itself | 20:39 |
| paroneayea | nkinkade: are all the /international/ pages managed by wordpress, I assume? | 20:40 |
| nkinkade | paroneayea: Yes. | 20:40 |
| paulproteus | If they're managed by WordPress, then I would treat them as unchangeable. | 20:40 |
| paulproteus | And just scrape messily. | 20:40 |
| nkinkade | paroneayea: How about just selecting all divs with class of ifloat in the first div with class icontainer? | 20:40 |
| nkinkade | I feel like BeautifulSoup would allow for that, but I haven't used it in a while. | 20:41 |
| paroneayea | because they're in the same div | 20:41 |
| nkinkade | paroneayea: From what I can tell, they are in two separate divs. | 20:42 |
| paroneayea | oh | 20:42 |
| paroneayea | wait | 20:42 |
| paroneayea | yeah you're right | 20:42 |
| paroneayea | I was being too reliant on the highlighting via firebug's inspector :) | 20:43 |
| paroneayea | which made it look like the div that held those jurisdiction icons was just a block above them | 20:43 |
| paroneayea | well never mind then, this should be very easy :) | 20:43 |
| *** bassel has quit IRC | 21:01 | |
| *** oshani has quit IRC | 21:08 | |
| *** JoiIto has quit IRC | 21:13 | |
| jed_ | just fancied up a tool that i've been using to test some of my work http://code.creativecommons.org/~john/ | 21:57 |
| *** jed_ is now known as JED3 | 21:57 | |
| nkinkade | JED3: How is this? .... http://us3.php.net/manual/en/function.mysql-fetch-assoc.php : "does not contain CC-REL Metadata." | 22:05 |
| nkinkade | Ooops ... wrong URL paste. :-) | 22:05 |
| nkinkade | http://creativecommons.org/licenses/by-nc-sa/3.0/ | 22:05 |
| nkinkade | There it is. | 22:05 |
| JED3 | nkinkade: it worked? | 22:07 |
| nkinkade | JED3: It said: "does not contain CC-REL Metadata." | 22:07 |
| nkinkade | But that shouldn't be right for the deeds. | 22:07 |
| JED3 | well, i don't believe the deeds have any self-referential rel=license's do they? | 22:08 |
| JED3 | nkinkade: ^^ | 22:09 |
| JED3 | so i guess that message of "no cc-rel metadata" is a bit misleading | 22:10 |
| JED3 | perhaps it would be better suited as "no CC rel=license link found" | 22:10 |
| paroneayea | JED3: that's awesome! | 22:11 |
| paroneayea | well | 22:11 |
| paroneayea | it's super pretty :) | 22:11 |
| paroneayea | and minimal and nice working :) | 22:11 |
| JED3 | paroneayea: thanks! | 22:11 |
| JED3 | this is the same thing we use on the deeds for the referer checking | 22:11 |
| nkinkade | JED3: That could be, but the deeds do contain plenty of cc-rel metadata. | 22:12 |
| nkinkade | Maybe I've just misunderstood what the tool does and is for. | 22:12 |
| JED3 | nkinkade: it's for extracting CC metadata from a page and displaying it in a human-readable form | 22:13 |
| JED3 | try inputting "http://joi.ito.com/" as an example | 22:13 |
| nkinkade | JED3: So it will only extract the metadata under certain circumstances? | 22:14 |
| JED3 | nkinkade: it extracts everything, but will only display when it's able to make assertions from a work's triples graph | 22:15 |
| JED3 | for instance if you include cc:attributionName or cc:attributionURL on a page but are not specifying a license for that work, those 2 triples are worthless for our sake | 22:16 |
| nkinkade | Cool. So it's not an all purpose cc-rel metadata extractor, but meant more for checking, for example, the marking on a site, perhaps using the chooser HTML or something similar. | 22:19 |
| JED3 | nkinkade: correct | 22:27 |
| mralex | JED3: if you really wanted to procrastinate, you could add :hover and :active states for that Scrape button ;)) | 23:23 |
| JED3 | mralex: oOo good idea | 23:23 |
| mralex | http://neography.com/experiment/circles/solarsystem/ | 23:24 |
| mralex | mmm, css3 | 23:25 |
| JED3 | mralex: mmm html5+webgl http://www.youtube.com/watch?v=OxoFcyKYwr0&fmt=22 | 23:28 |
| mralex | fun | 23:29 |
| mralex | if webgl ever goes anywhere | 23:30 |
| mralex | or the zombie uprising of vrml | 23:30 |
| akozak | dithyramble, you're flying out of GRR right? | 23:34 |
| akozak | paulproteus, do you happen to know what airport he's flying out of on the 17th? :P | 23:37 |
| akozak | I forgot to ask | 23:37 |
| akozak | paulproteus, oh nevermind | 23:37 |
| akozak | he sais LAN | 23:37 |
| akozak | said* | 23:37 |
| paulproteus | Wow I'm Full | 23:51 |
| dithyramble | akozak: at least it's not called BRR | 23:51 |
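
The scraping approach settled on around 20:40 (take the `ifloat` divs inside the first `icontainer` div) can be sketched roughly as follows. This is only an illustration: it uses Python's standard-library `HTMLParser` rather than the BeautifulSoup that nkinkade mentions, and the `SAMPLE` markup, the `/international/xx/` link, and the names `CompletedJurisdictionParser` and `completed_jurisdiction_links` are assumptions made up for the example, not taken from the real page or codebase.

```python
from html.parser import HTMLParser


class CompletedJurisdictionParser(HTMLParser):
    """Collect link targets from div.ifloat blocks inside the first div.icontainer."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._div_stack = []          # class list of every currently open <div>
        self._container_done = False  # True once the first icontainer has closed

    def _inside(self, css_class):
        # True if any currently open <div> carries the given class.
        return any(css_class in classes for classes in self._div_stack)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            self._div_stack.append((attrs.get("class") or "").split())
        elif tag == "a" and not self._container_done:
            if self._inside("icontainer") and self._inside("ifloat"):
                href = attrs.get("href")
                if href:
                    self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "div" and self._div_stack:
            closed = self._div_stack.pop()
            # Stop collecting once the outermost icontainer has been closed.
            if "icontainer" in closed and not self._inside("icontainer"):
                self._container_done = True


def completed_jurisdiction_links(html):
    """Return hrefs found in the first icontainer's ifloat blocks, in page order."""
    parser = CompletedJurisdictionParser()
    parser.feed(html)
    return parser.links


# Hypothetical markup: two containers, the second standing in for the
# "Project Jurisdictions" block that should be ignored.
SAMPLE = """
<div class="icontainer">
  <div class="ifloat"><a href="/international/br/">Brazil</a></div>
  <div class="ifloat"><a href="/international/nz/">New Zealand</a></div>
</div>
<div class="icontainer">
  <div class="ifloat"><a href="/international/xx/">In progress</a></div>
</div>
"""

if __name__ == "__main__":
    print(completed_jurisdiction_links(SAMPLE))
```

Tracking open divs on a stack keeps the sketch independent of how deeply the icons are nested, which matters because the real template's structure is only known here from the conversation.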
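
The tool/test paroneayea proposes at 19:53 boils down to comparing the licenses a jurisdiction page expects against the licenses actually present in the RDF. A minimal sketch of that comparison, assuming both sides have already been reduced to sets of `(license code, version)` pairs keyed by jurisdiction; the function name `missing_licenses` and the sample data are hypothetical and are not read from the real RDF or the real site.

```python
def missing_licenses(expected, available):
    """Return {jurisdiction: sorted (code, version) pairs expected but not available}."""
    missing = {}
    for jurisdiction, wanted in expected.items():
        gap = set(wanted) - set(available.get(jurisdiction, ()))
        if gap:
            missing[jurisdiction] = sorted(gap)
    return missing


# Illustrative data only, shaped like the Brazil case from the chat:
# the page lists 3.0 licenses while the RDF only knows about 2.5.
expected_from_pages = {
    "br": {("by", "3.0"), ("by-sa", "3.0")},
    "nz": {("by", "3.0")},
}
available_in_rdf = {
    "br": {("by", "2.5"), ("by-sa", "2.5")},
    "nz": {("by", "3.0")},
}

if __name__ == "__main__":
    for jurisdiction, gap in missing_licenses(expected_from_pages, available_in_rdf).items():
        print(jurisdiction, "is missing", gap)
```

A test built on this would fail whenever the returned mapping is non-empty, which sidesteps the objection raised at 19:51 that the RDF alone cannot reveal what it is missing.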