nkinkade | paulproteus: Can you log into a5 for a minute? | 00:02 |
---|---|---|
nkinkade | 4 eyes are better than 2. | 00:02 |
nkinkade | Apache is acting up, or is being affected by something else ... | 00:02 |
nkinkade | Nothing has gone down yet, but it may. | 00:02 |
paulproteus | nkinkade, Conveniently I have four eyes already! | 00:03 |
* paulproteus sshs to a5 | 00:03 | |
nkinkade | paulproteus: pid 24204 | 00:03 |
nkinkade | I'm already stracing it, but nothing overly interesting: rt_sigtimedwait(~[ILL TRAP ABRT BUS FPE KILL SEGV USR2 PIPE CONT STOP SYS RTMIN RT_1], | 00:04 |
nkinkade | I don't know what it's waiting on. | 00:04 |
paulproteus | Can you ^C your strace for me? | 00:04 |
nkinkade | paulproteus: Done. | 00:05 |
nkinkade | swap is growing and the machine could become unresponsive. | 00:05 |
nkinkade | I've already got the first Nagios message that Web Services is down, so Apache is hit pretty hard. | 00:05 |
nkinkade | It can't even return a 20 or 30 byte static HTML file. | 00:06 |
paulproteus | #0 0x00002b1b0345e1c3 in do_sigwait () from /lib/libpthread.so.0 | 00:06 |
paulproteus | #1 0x00002b1b0345e25d in sigwait () from /lib/libpthread.so.0 | 00:06 |
paulproteus | #2 0x00002b1b03007a5b in apr_signal_thread () from /usr/lib/libapr-1.so.0 | 00:06 |
paulproteus | #3 0x00002b1b074ae70c in ?? () from /usr/lib/apache2/modules/mod_wsgi.so | 00:06 |
paulproteus | It's stuck in some mod_wsgi threading mess. | 00:06 |
paulproteus | Sounds no good to me; we could kill it abruptly. | 00:06 |
paulproteus | Should only interrupt one request at most. | 00:06 |
paulproteus | Shall I? | 00:06 |
paulproteus | If other methods don't work, gdb'ing in and doing "call exit(0)" should do the trick. | 00:07 |
nkinkade | paulproteus: Please do kill it. | 00:07 |
paulproteus | Er, now it's not quite responding to my touch. | 00:08 |
* paulproteus keeps waiting | 00:08 | |
nkinkade | For me either. | 00:08 |
nkinkade | I was afraid it might spiral downward. | 00:08 |
nkinkade | paulproteus: Did you use gdb to get that info above? | 00:08 |
paulproteus | Yes. | 00:08 |
paulproteus | "sudo gdb -p $PID", then ^C to get a gdb prompt, then "bt" to get a backtrace. | 00:09 |
nkinkade | That's good information. | 00:09 |
nkinkade | At least we now have some minimal idea of what might be causing it. | 00:09 |
paulproteus | I emailed about this a bit ago in a discussion of core dumps I think. | 00:09 |
paulproteus | Ya. | 00:09 |
nkinkade | I may have to Rapid Reboot the thing. | 00:10 |
paulproteus | Really? | 00:10 |
paulproteus | Gosh. | 00:10 |
paulproteus | One sec. | 00:10 |
nkinkade | Or not. | 00:10 |
paulproteus | (gdb) call exit(0) | 00:10 |
paulproteus | And now we let the OS clean that sucker up. | 00:10 |
paulproteus | ...which is taking a while. | 00:11 |
paulproteus | Sweet! done. | 00:11 |
paulproteus | That's my favorite trick in the whole book. | 00:11 |
nkinkade | call exit(0)? | 00:11 |
paulproteus | Yes. | 00:11 |
paulproteus | It's the Jedi Mind Trick for processes that don't want to exit. | 00:11 |
nkinkade | I've learned something new. | 00:11 |
* paulproteus waves hands and flashes the debugger | 00:12 | |
paulproteus | "You want to exit successfully." | 00:12 |
paulproteus | (gdb) call exit(0) | 00:12 |
paulproteus | Program exited normally. | 00:12 |
paulproteus | The program being debugged stopped while in a function called from GDB. | 00:12 |
paulproteus | It's a pretty evil trick. | 00:12 |
nkinkade | It must be the metadata_scraper causing that. | 00:12 |
paulproteus | In theory I should maybe call _exit(0) to be even more evil. | 00:13 |
paulproteus | Did you look at whose child it was before we killed it? | 00:13 |
paulproteus | I guess we're now guessing. | 00:13 |
nkinkade | I don't know of anything else that uses mod_wsgi. | 00:13 |
paulproteus | Okay, that's good to know. | 00:13 |
nkinkade | At least I don't think anything else uses it. | 00:13 |
nkinkade | nathany: Is that correct as far as you know? That the only thing on a5 using mod_wsgi is the metadata_scraper? | 00:14 |
paulproteus | <paulproteus> #2 0x00002b1b03007a5b in apr_signal_thread () from /usr/lib/libapr-1.so.0 | 00:14 |
paulproteus | I wonder if that's the Apache Portable Runtime trying to respond to your signals. | 00:14 |
paulproteus | Like your kill signals. | 00:14 |
nkinkade | paulproteus: Which kill signals? | 00:15 |
paulproteus | Didn't you do kill -SOMETHING $PID? | 00:15 |
nkinkade | paulproteus: Nope. | 00:15 |
paulproteus | Oh, okay. | 00:15 |
paulproteus | Somehow I thought you had. | 00:15 |
paulproteus | My way is a little more abrupt but much more guaranteed to work. | 00:15 |
nkinkade | paulproteus: However, it is possible that the watchdog tried to restart Apache and/or Varnish. | 00:15 |
paulproteus | I seem to be mistaken about the purpose of that. | 00:15 |
paulproteus | It would be nice to have debugging symbols for these libraries so these stack traces are a little more useful. | 00:16 |
nkinkade | I only logged in, saw the process that was seemingly hung, and then straced it, to no avail. | 00:16 |
* paulproteus installs libapr1-dbg for the APR symbols at least | 00:16 | |
paulproteus | Right-o. | 00:16 |
nkinkade | It was only coincidence that I logged in, and I noticed it was sluggish in giving me a prompt so I top'd the thing. | 00:16 |
paulproteus | And I never passed it a signal. I think that was a red herring. | 00:17 |
nathany | nkinkade: yes, AFAIK that's all on a5 | 00:17 |
nathany | you can look at the apache conf files to be certain | 00:17 |
nathany | (sorry for the delay, i was away from my desk and then reading the backlog to figure out what was going on) | 00:17 |
nkinkade | nathany: It's possible that something with it or the scraper is causing these recent issues on a5. | 00:17 |
nathany | hrm | 00:17 |
paulproteus | nkinkade, You said it had huge memory usage? | 00:18 |
paulproteus | I did no forensics other than the stacktrace and the death. | 00:18 |
nkinkade | paulproteus: Yeah, it seemed to have almost all of the available physical memory and was slowly eating up swap. | 00:18 |
nathany | oh, damn; i wonder if i ever implemented the guard to keep the scraper from endlessly scraping the entire internet | 00:18 |
nkinkade | haha | 00:18 |
nkinkade | We're all about the metadata. | 00:19 |
paulproteus | meat data | 00:19 |
nkinkade | And by god, we'll scrape the whole 'net if that's what it takes. | 00:19 |
nkinkade | :-) | 00:19 |
nathany | lol | 00:19 |
nkinkade | I would be super happy if it were something like that. | 00:19 |
nkinkade | Because a5 has been running really nicely since the RAM upgrade (and a7 too). | 00:20 |
nkinkade | Until the past week or so, that is. | 00:20 |
nkinkade | nathany: Were you serious when you said that, or was it a joke? | 00:23 |
nathany | nkinkade: unfortunately serious | 00:23 |
nathany | and yeah, just looked and it's possible to construct a scraper honeypot | 00:23 |
nathany | that would keep it busy busy busy | 00:23 |
nkinkade | nathany: Do you think you'll have time to fix that soon? | 00:24 |
nkinkade | It seems pretty urgent. | 00:24 |
nathany | i'm opening a bug now; need to go home and walk the dog but i can look at it this evening | 00:24 |
nathany | should be a straight-forward fix | 00:24 |
nkinkade | If that's even the real problem ... | 00:24 |
nkinkade | Hopefully that could be it. | 00:24 |
nathany | :) | 00:25 |
nathany | http://code.creativecommons.org/issues/issue116 | 00:25 |
nathany | feel free to nosy yourself if you want to follow along at home | 00:25 |
paulproteus | nkinkade, Well, crisis averted. | 00:26 |
paulproteus | The thing is, though, the mod_wsgi part of Apache isn't actually the scraper. | 00:26 |
paulproteus | That's a totally separate Python process. | 00:26 |
paulproteus | You'd think the Python process would have been the one to blow up and use all the RAM. | 00:27 |
paulproteus | Did you check if the Apache2 process, or instead some Python process, was using up the memory? | 00:27 |
paulproteus | Also, when I run top on a5, I see all the processes. | 00:28 |
nkinkade | paulproteus: I'm not actually sure. I usually shift-m top to sort by memory, but I didn't do that this time | 00:28 |
paulproteus | That feels really strange. | 00:28 |
nkinkade | What is your screen resolution? | 00:28 |
* paulproteus nods | 00:28 | |
paulproteus | 1600x1200 | 00:28 |
paulproteus | But there are a few blank lines at the bottom. | 00:28 |
nkinkade | There are only 90 processes, so that seems reasonable. | 00:29 |
paulproteus | Yeah, I just find that surprisingly few. | 00:29 |
nkinkade | paulproteus: Thanks for lending me a hand. The gdb trick is a good one for me to know. | 00:29 |
paulproteus | It's pure evil. | 00:29 |
nkinkade | The backtrace seems to have been useful, the call exit(0) I couldn't say. | 00:30 |
paulproteus | http://paulproteus.acm.jhu.edu/top-on-a5.png | 00:31 |
nkinkade | But a5 doesn't run very much ... it's more or less dedicated to creativecommons.org and ns1 | 00:31 |
paulproteus | Ya. | 00:31 |
nkinkade | Omg, why is portmap running? | 00:31 |
paulproteus | Time for sudo apt-get remove nfs-common. | 00:32 |
* paulproteus beats you to it | 00:32 | |
nkinkade | purge? :-) | 00:32 |
nkinkade | How in heck did nfs-common get in there? | 00:32 |
paulproteus | Eh, just remove (and I also removed portmap itself). | 00:32 |
paulproteus | I think it's part of the Debian default. | 00:32 |
paulproteus | Since it's somewhat non-obvious that without it, you can't use NFS. | 00:33 |
nkinkade | I almost always go through and remove nfs-common and I'm sure I had done it for a5 and all the others. | 00:33 |
paulproteus | You might like the package cssh. | 00:33 |
nkinkade | Maybe it went in as a dep of some other package. | 00:33 |
paulproteus | Nope, I don't think so. | 00:34 |
nkinkade | NFS is in the default install but not less! | 00:34 |
paulproteus | Because it's obvious what to do when less says command not found. | 00:34 |
paulproteus | It's less obvious what to do when you try to mount a remote filesystem and mount hangs. | 00:34 |
nkinkade | But do you think NFS is so utilized by everyone that it must be in the default install? | 00:35 |
nkinkade | I haven't used NFS in a long time. | 00:35 |
*** stevel_ has joined #cc | 00:35 | |
nkinkade | Though I know you have. | 00:35 |
paulproteus | Heh. | 00:35 |
paulproteus | But still, if mount hangs, that's confusing. | 00:35 |
nkinkade | That's reasonable. | 00:35 |
paulproteus | It doesn't give an error message like, "WTF install portmap." | 00:36 |
paulproteus | Which it probably should. | 00:36 |
nkinkade | paulproteus: How insecure do you think it would be to set ionice setuid? | 00:38 |
nkinkade | I imagine it's a fairly simple program, so it seems like maybe it wouldn't be too bad, but maybe I'm missing something. | 00:39 |
paulproteus | It can fork a shell. | 00:39 |
paulproteus | Setuid ionice is equivalent to setuid bash. | 00:39 |
paulproteus | Why do you want that? | 00:39 |
paulproteus | To avoid installing ionicer? | 00:40 |
paulproteus | The more sensible thing to do apparently is to adjust the security capability preferences, says a friend of mine, so any program can adjust any other program's IO niceness. | 00:40 |
paulproteus | That's still not so great, which is why I still think ionicer is nicer. | 00:40 |
nkinkade | I would like to ionice the mysqldumps and the subsequent tarring of them, but I don't want to have to divert mysqldump and tar. | 00:41 |
paulproteus | Isn't that done by cron? | 00:41 |
nkinkade | This one is done remotely via SSH. | 00:41 |
paulproteus | Oh, that's even easier in a way. | 00:41 |
paulproteus | Is it done by SSH as root? | 00:42 |
nkinkade | Nope. | 00:42 |
nkinkade | As everett. | 00:42 |
paulproteus | Ah-hah. | 00:42 |
paulproteus | You could make everett's shell /usr/bin/ionice! | 00:42 |
paulproteus | Er, wait, same problem. | 00:42 |
paulproteus | In his .bashrc, add ionicer. | 00:42 |
nkinkade | What I like about the current way of doing the mysql backups is that it doesn't require any config on the remote machines. | 00:42 |
paulproteus | Oh, right. | 00:42 |
nkinkade | Just one script on the backup machine and ta-da. | 00:43 |
paulproteus | Except the creation of the user. | 00:43 |
paulproteus | And the addition of the SSH key presumably. | 00:43 |
nkinkade | Yeah, one thing. | 00:43 |
paulproteus | But I agree, that's not much configuration. | 00:43 |
nkinkade | Okay, two things. | 00:43 |
nkinkade | :-) | 00:43 |
nkinkade | Maybe I'll try your suggestion of adding ionicer to .bashrc. | 00:44 |
paulproteus | Still, I see what you mean. | 00:44 |
nkinkade | He doesn't do anything that I wouldn't want to be ioniced. | 00:44 |
nkinkade | What I'm confused about is why these machines are so sensitive like this. | 00:45 |
paulproteus | We could modify ionicer so that it did fork a process, I would just have to be more careful about its security. | 00:45 |
nkinkade | (to disk I/O) | 00:45 |
nkinkade | You mean like: | 00:45 |
nkinkade | ionicer somecommand --some-option | 00:46 |
paulproteus | Which is how ionice works in the first place. | 00:47 |
paulproteus | It'd be nice (!) to make a /bin/nicebash shell. | 00:47 |
paulproteus | But that's somewhat stupidly special-purpose. | 00:47 |
nkinkade | Right, and this is why I thought about setuid'ing ionice, but clearly I didn't think it through! | 00:48 |
paulproteus | I'm going to think about other things, like this BayPiggies talk I'm giving soon. | 00:48 |
paulproteus | Also I do like the window manager called awesome so far. | 00:49 |
*** nathany has quit IRC | 00:51 | |
paulproteus | Can't get there from here! | 00:51 |
*** stevel has quit IRC | 00:52 | |
*** TimStarling has joined #cc | 01:10 | |
*** stevel_ has quit IRC | 01:25 | |
*** tanjir has quit IRC | 01:25 | |
*** tanjir has joined #cc | 01:26 | |
*** Bovinity has quit IRC | 01:34 | |
*** iamiamiam has joined #cc | 01:35 | |
iamiamiam | ??? | 01:36 |
iamiamiam | somebody??? | 01:36 |
iamiamiam | i need help!!!!! | 01:37 |
*** iamiamiam has left #cc | 01:38 | |
greg-g | call 911? | 01:39 |
*** mlinksva has quit IRC | 02:03 | |
*** everton137 has joined #cc | 02:51 | |
*** everton137 has quit IRC | 03:06 | |
*** nkinkade has left #cc | 03:09 | |
*** Danny_B has quit IRC | 05:11 | |
*** mlinksva has joined #cc | 05:17 | |
*** Danny_B has joined #cc | 05:18 | |
*** tanjir has quit IRC | 06:18 | |
*** jgay has joined #cc | 06:40 | |
*** mlinksva has quit IRC | 06:56 | |
*** TimStarling has left #cc | 08:38 | |
*** UncleCJ2_ has quit IRC | 09:16 | |
*** UncleCJ2_ has joined #cc | 09:17 | |
*** grahl has joined #cc | 11:49 | |
*** jgay has quit IRC | 13:26 | |
*** zer0 has joined #cc | 13:36 | |
zer0 | can i get cc here?////// | 13:39 |
zer0 | ? | 13:39 |
*** zer0 has quit IRC | 13:45 | |
*** Ekushey has joined #cc | 14:31 | |
*** mlinksva has joined #cc | 14:55 | |
*** grahl has quit IRC | 15:17 | |
*** UncleCJ2_ has quit IRC | 15:25 | |
*** jgay has joined #cc | 15:37 | |
*** nkinkade has joined #cc | 15:44 | |
*** Ekushey- has joined #cc | 16:09 | |
*** Ekushey has quit IRC | 16:21 | |
*** Ekushey- has quit IRC | 16:46 | |
*** nathany has joined #cc | 16:52 | |
*** stevel has joined #cc | 16:54 | |
*** m3cr3d1s has joined #cc | 17:00 | |
m3cr3d1s | You'll all be happy to know I added a white background to the CC icon on twitter | 17:07 |
*** cacimar has joined #cc | 17:32 | |
*** cacimar has quit IRC | 17:33 | |
*** cacimar has joined #cc | 17:43 | |
*** Bovinity has joined #cc | 17:48 | |
*** mecredis_ has joined #cc | 18:01 | |
*** m3cr3d1s has quit IRC | 18:03 | |
*** stevel has quit IRC | 18:04 | |
*** stevel has joined #cc | 18:04 | |
*** balor has joined #cc | 18:49 | |
paulproteus | mecredis, Is a white background better than a transparent one? | 19:15 |
Bovinity | O_o | 19:18 |
*** Prog66 has joined #cc | 19:26 | |
*** Prog66 has left #cc | 19:26 | |
*** m3cr3d1s has joined #cc | 19:29 | |
*** rohitj has quit IRC | 19:30 | |
*** mecredis_ has quit IRC | 19:34 | |
*** mecredis_ has joined #cc | 19:52 | |
*** m3cr3d1s has quit IRC | 19:53 | |
Bovinity | oh, my, god, http://www.penny-lane.com/sk/Colclough%20-%20Please%20Do%20Not%20Succumb%20to%20Greed%20(memix).mp3 | 20:45 |
nkinkade | That's awesome! | 20:49 |
nkinkade | I guess Cobert will be very upset, and maybe even litigious! | 20:49 |
nkinkade | (not likely) | 20:49 |
*** cacimar has quit IRC | 21:07 | |
mecredis_ | hah | 21:13 |
paulproteus | nathany, So you recorded the video separately from the audio, and then mux'd them later. Do you recall with what? | 21:14 |
paulproteus | (And it seems I'll use gtk-recordmydesktop for the video recording.) | 21:15 |
nathany | paulproteus: gtk-recordmydesktop sounds familiar | 21:16 |
nathany | i used a command line tool of your suggestion | 21:16 |
paulproteus | Oh, right, I asked you this before. | 21:16 |
* paulproteus chuckles and sighs. | 21:16 | |
nkinkade | Breasts and domain names, a really sensible combination: http://imagesak.godaddy.com/gdtv/bkg/bkg_tv_video_candice.jpg | 21:34 |
paulproteus | diveintobreasts.org? | 21:49 |
nathany | W. T. F.? i cam in at the wrong time | 21:51 |
nathany | came | 21:51 |
nathany | LOL | 21:51 |
Bovinity | oh boy... | 21:51 |
nathany | wow | 21:51 |
nathany | seems like a ridiculous combination to *ME* ;) | 21:52 |
*** pktck has joined #cc | 22:03 | |
*** balor has quit IRC | 22:07 | |
*** pktck has quit IRC | 22:11 | |
*** pktck has joined #cc | 22:11 | |
*** pktck_ has joined #cc | 22:21 | |
*** pktck__ has joined #cc | 22:26 | |
*** pktck_ has quit IRC | 22:26 | |
*** pktck has quit IRC | 22:30 | |
nkinkade | nathany: I see you just committed some changes to the scraper. Does that mean that the potential issue with the scraper should be fixed? | 22:40 |
nathany | i need to test + update, but possible | 22:40 |
nkinkade | Just want to know where things stand in case any problems crop up with a5 again. | 22:40 |
nathany | possibly | 22:40 |
*** pktck__ has quit IRC | 23:06 | |
*** pktck has joined #cc | 23:06 | |
*** nathany has quit IRC | 23:15 | |
*** tim_hwang has joined #cc | 23:28 | |
*** pktck has quit IRC | 23:28 | |
*** pktck has joined #cc | 23:28 | |
Bovinity | nkinkade: ping | 23:29 |
nkinkade | Bovinity: Hi. | 23:31 |
Bovinity | nkinkade: how do i add an email address to the photocopier address book? | 23:32 |
Bovinity | specifically, mine | 23:32 |
nkinkade | Bovinity: One second. | 23:33 |
*** rohitj has joined #cc | 23:37 | |
nkinkade | Bovinity: Did you figure it out? | 23:52 |
nkinkade | You have to press the login button then enter a really secret password: 11111 | 23:53 |
Bovinity | login button? | 23:53 |
nkinkade | At that point you have System and User settings. I can't remember where the address book is. | 23:53 |
nkinkade | There is a button at the top right of the LCD display. | 23:54 |
nkinkade | I don't know what it says, but it should be something that is semi-indicative of logging in or admin'ing the thing. | 23:54 |
*** mecredis_ has quit IRC | 23:55 | |
nkinkade | Once you get into the right settings area, there is an address book option. | 23:55 |
Bovinity | yeah, got it ;) | 23:56 |
Bovinity | i didn't notice the login button on the panel | 23:56 |
nkinkade | Bovinity: Did you get your address entered okay? | 23:58 |
Bovinity | nkinkade: yep, thx | 23:58 |
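
The scraper guard nathany mentions at 00:18 (a limit that keeps the scraper from "endlessly scraping the entire internet") might look roughly like the sketch below. This is not the actual metadata_scraper fix; `bounded_crawl`, `get_links`, `max_pages`, `max_depth`, and the honeypot function are all hypothetical names chosen for illustration, and the link graph is simulated in memory rather than fetched over HTTP.

```python
from collections import deque


def bounded_crawl(start, get_links, max_pages=50, max_depth=3):
    """Breadth-first crawl that gives up after max_pages pages or max_depth hops.

    get_links is any callable mapping a URL to the URLs it links to
    (hypothetical; a real scraper would fetch and parse the page here).
    """
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in get_links(url):
            if link not in seen:  # never queue the same URL twice
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


def honeypot_links(url):
    """A simulated scraper honeypot: every page links to a brand-new page."""
    return [url + "/next"]


# Without the limits this loop would never end; with them it stops early.
pages = bounded_crawl("http://example.invalid", honeypot_links,
                      max_pages=10, max_depth=100)
print(len(pages))
```

Either limit alone is enough to stop the honeypot case; keeping both also bounds memory, since the `seen` set cannot grow faster than the number of pages actually visited.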
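
The gdb steps paulproteus describes at 00:09 ("sudo gdb -p $PID", then "bt") can also be run non-interactively. The sketch below is an assumption about how one might script that, not something used in the log: the function names are hypothetical, it relies on gdb's `-p`, `-batch`, and `-ex` options, and it assumes gdb and sudo are installed on the machine. It only collects the backtrace; the `call exit(0)` trick is left as a manual step because it kills the target process.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build the argv for a one-shot backtrace of a running process.

    -p attaches to the pid, -ex "bt" runs the backtrace command, and
    -batch makes gdb exit (detaching from the process) when it is done.
    """
    return ["sudo", "gdb", "-p", str(int(pid)), "-batch", "-ex", "bt"]


def dump_backtrace(pid):
    """Run gdb against pid and return its output (requires gdb and root)."""
    result = subprocess.run(gdb_backtrace_command(pid),
                            capture_output=True, text=True)
    return result.stdout


# Only the command is built here; nothing is attached to or executed.
print(" ".join(gdb_backtrace_command(24204)))
```

Calling `dump_backtrace(24204)` on the affected host would correspond to the manual session above, with 24204 being the pid nkinkade reported at 00:03.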