Well, the blank screen is probably due to a pipe programming timing problem. IGD devices have been getting less & less sensitive to it over time (830s would hang if you looked at them the wrong way) but it's still not that hard to mis-program them. Can you narrow down the problem at all? Did it occur just after a kernel upgrade? Or did you also upgrade your X driver? There haven't been any real changes in intelfb since Oct. when Krzysztof added interlaced mode support, The framebuffer drivers provide a simple interface to applications wanting to draw to the screen. It's usually used in embedded devices and for boot splash screens. Generally you don't want to run X through the framebuffer I think the only thing you'll lose is the boot splash screen. But you could also try using vesafb. It works with more hardware than intelfb (including laptops) and we've fixed some vesafb related bugs in the X driver recently, so it may work better for you. Jesse --
Not that I can think of. Other than a custom kernel I use standard Ubuntu 7.10. I was testing out the kernel starting with 2.6.23-rc4 and I've tested each rc, recompiling when a new one came out (except for rc1 and rc2, they didn't compile). So, when 2.6.25-rc3 came out I recompiled and came across this error. I've removed all fb devices and the blank screen _almost_ never happens (also, I _don't_ have to disable the splash screen). There is one _big_exception_: the problem still happens when I'm at my University (and _very_ consistently). There, I have to disable the splash screen, and that ~mostly~ fixes the problem. I can't believe it could be related, but when I boot the computer at the Univ. I get a flood of kernel messages related to the wireless all the time, but still connect: > kernel: wlan0_rename: RX too short data frame payload _Could_ a wireless problem be related to a graphics problem? I wouldn't think so? But, it would explain why I get the blank screen mostly when I get the wireless problem. But, then again it could be Well, if the fb is not necessary, then I think I'll just leave it out. Unless you think using the vesafb would help. But, I thought that the X server doesn't interact with the fb driver. Maybe it's not related to the fb but the other intel driver: CONFIG_AGP=y CONFIG_AGP_INTEL=y CONFIG_DRM=y CONFIG_DRM_I915=m I've updated my config if you need to see the newest version: http://jdserver.homelinux.org/linux/config-2.6.25-rc5 Justin --
It would be interesting if you could get register dumps at a couple of different points, using the intel_reg_dumper tool in git://git.freedesktop.org/git/xorg/driver/xf86-video-intel (in src/reg_dumper). You'll probably have to modify your boot scripts though. It would be good to get them:
- at startup time before the splash screen
- sometime while the splash screen is running
- after X starts and you see the blank screen
Since I think this is actually an X driver bug, can you file a bug at Well, it's possible that the wireless issue is affecting the pipe programming timing enough to expose the bug (stuff like this is usually a problem with either not waiting for a register program to take effect and/or programming Yeah, looks like it's off. Ubuntu may be falling back to using the X vesa driver or something though... Jesse --
I couldn't get it to compile. What am I supposed to do? I ran autogen.sh in the top level dir and it gave the error message:
./configure: line 20486: syntax error near unexpected token `XINERMA,'
./configure: line 20486: `XORG_DRIVER_CHECK_EXT(XINERMA, xineramaproto)'
Then I also tried to run make and make -f Makefile.in (in the reg_dumper dir and top level dir), and it said:
Makefile.in:15 *** missing separator. Stop.
That part of Makefile.in contains:
@SET_MAKE@
VPATH = @srcdir@
What am I doing wrong? You'll have to forgive me, I'm not that How exactly would I do that? I would think I could just add a command at the end of 3 files in /etc/init.d. Would that work? It should run my command when it starts that script and gets to the end. Or do boot scripts exit in the middle of the script, which would prevent my command I'm not sure. I just have a home wireless router (Hawking HWR54G). I have no idea what the Uni. router is. On both, I connect to an unencrypted network, but at the Uni. it redirects to a login page where I have to log on before I can actually use the Internet. Is there any way I could get more information about the Uni. router? Justin --
[Adding Bryce to cc list, he may have a copy of intel_reg_dumper already built for Ubuntu. Bryce, please see below for a few more questions, thanks.] This usually means you don't have the xorg autoconf macros installed. Your No problem, building X still isn't quite as easy as it should be, but it's only slightly more complicated than the typical './configure;make;make install' due to the dependencies between packages. You can check out the X Hm, I haven't looked at the Ubuntu scripts before. I know they're using upstart, but if they haven't diverged too much from the old style init, you may be able to modify rc.sysinit to get the boot time register dump. For the splash screen dump, you'd just have to add the intel_reg_dumper command to one of the other init scripts that runs while the splash screen is up (maybe the HAL daemon script or something). Once your screen is blank, it's probably easiest to ssh into your machine and capture the dump that way. Probably, though the messages in your log from your Univ. connection may be enough for the networking guys to figure things out. It's probably best to track the wireless thing as a separate issue though, I'd recommend mailing linux-wireless@vger.kernel.org with the problem. Thanks, Jesse --
The following command will pull in all the dependencies you need for building -intel: If you're having troubles while the splash screen is displayed, you may want to look at / fiddle with the settings in /etc/gdm/gdm.conf. Also have a look in /etc/gdm/, where the Init, PreSession, and PostSession scripts are located. Those are additional places where scripts can be tied in during this early startup phase. (I don't know if that would actually be of use here though.) Bryce --
I'm just compiling the intel_reg_dumper right? I don't have to recompile all of X to use it? Ok, got that. I tried to compile and I found out I also had to install libpciaccess-dev, that's probably because I'm using the git tree and not the one that Ubuntu 7.10 uses. But I still can't compile the intel_reg_dumper; I get the following error:
intel-driver/src/reg_dumper$ make
gcc -DHAVE_CONFIG_H -I. -I../.. -Wall -Wpointer-arith -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wnested-externs -fno-strict-aliasing -I./.. -DREG_DUMPER -g -O2 -MT main.o -MD -MP -MF .deps/main.Tpo -c -o main.o main.c
main.c: In function ‘main’:
main.c:72: warning: implicit declaration of function ‘pci_device_map_range’
main.c:72: warning: nested extern declaration of ‘pci_device_map_range’
main.c:75: error: ‘PCI_DEV_MAP_FLAG_WRITABLE’ undeclared (first use in this function)
main.c:75: error: (Each undeclared identifier is reported only once
main.c:75: error: for each function it appears in.)
make: *** [main.o] Error 1
Justin --
<jbarnes> bryce: would it be possible for you to include intel_reg_dumper in your intel driver pkg?
jesse - I looked into this, and in fact debian has already enabled this in the 2.2.1 driver, which we are carrying in Hardy. Timo had to disable it though since we currently carry libpciaccess in universe, not main. However, I note that even after installing libpciaccess, it still fails Yes, the same error occurs with the 2.2.1 driver we have in Hardy right now:
bryce@chideok:~/src/xserver-xorg-video-intel/xserver-xorg-video-intel-2.2.1-build/src/reg_dumper$ make
gcc -DHAVE_CONFIG_H -I. -I../.. -Wall -Wpointer-arith -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wnested-externs -fno-strict-aliasing -I./.. -DREG_DUMPER -g -O2 -MT intel_reg_dumper-main.o -MD -MP -MF .deps/intel_reg_dumper-main.Tpo -c -o intel_reg_dumper-main.o `test -f 'main.c' || echo './'`main.c
main.c: In function 'main':
main.c:72: warning: implicit declaration of function 'pci_device_map_range'
main.c:72: warning: nested extern declaration of 'pci_device_map_range'
main.c:75: error: 'PCI_DEV_MAP_FLAG_WRITABLE' undeclared (first use in this function)
main.c:75: error: (Each undeclared identifier is reported only once
main.c:75: error: for each function it appears in.)
make: *** [intel_reg_dumper-main.o] Error 1
I could get it to build by ifdefing out the pci_device_map_range call, and then adding -lpciaccess to the linker, however it segfaults without that call. Does LIBPCIACCESS require xserver 1.4? (We have 1.3 in Hardy). Bryce --
No, libpciaccess should be standalone, but the Intel driver requires 0.10 or better... Which version are you building against? --
I've put in a sync request for this to get updated for hardy. (It still will only be in universe, so won't be built automatically with -intel, but at least users will be able to more easily compile it by hand.) Meanwhile, here is an Ubuntu Hardy build of Debian's package: http://people.ubuntu.com/~bryce/Testing/libpciaccess/libpciaccess-dev_0.10-1_i386.deb http://people.ubuntu.com/~bryce/Testing/libpciaccess/libpciaccess0_0.10-1_i386.deb Bryce --
Thanks a lot Bryce, hopefully this will help Justin build the register dumper so we can track down his problem... Jesse --
Thanks for all your help so far! Well I tried to install the deb files that Bryce created, unfortunately they're dependent on a newer version of libc6. I think (_hope_) this is the one in Hardy _because_ that's what I'm doing right now; I've decided to upgrade to Hardy. (so sad because I have to download _1.5GB_) Jesse, do you still think it's a bug in the X server driver instead of the kernel driver (dri I guess)? Because I can trigger the blank screen before the X server even starts. If I press ctrl+alt+f1 at the _right_ time when the kernel is booting (or I guess when the boot splash is showing) it will instantly blank out. That doesn't seem like something related to X. Anyways maybe it's multiple things, and by upgrading to Hardy it might fix it (which would prove that it's not a kernel bug, but some software version dependency bug - which would still be good to know). Justin --
Depends on the boot splash program. I think in some configurations it'll be X Good luck with the upgrade... Jesse --
Mmm, well indeed that may be the kernel framebuffer.
Justin, one thing you could try is to set your system to boot up without
starting gdm. Either do this via /etc/X11/default-display-manager
(comment it out, or set to xdm or something) or bypass it entirely by
mv /etc/rc3.d/{S30,K30}gdm, and then append '3' to the end of the boot line
in grub. (Maybe there's a better way to achieve this with upstart, but
my upstart-fu is limited.)
Bryce
--
Well, the upgrade went ok, and compiling the reg_dumper using the libpciaccess .deb from Bryce worked. Then I tried to add to the boot scripts a call to reg_dumper... ...To make a long story short... I somehow killed my boot scripts! Anyways, I did a fresh reinstall of Ubuntu 8.04 Beta. I'm still getting the blank screen problem with the 2.6.25-rc6 kernel, so I guess it wasn't an Ubuntu software problem (or I hope not, because that could be really hard to find). What I did was create a script that took a reg_dump every 6 secs for 1 min. I made that as rc2.d/S01regdump so it should've been the very first thing called. So, I hope there's enough "data points" to see what's happening. Reg Dump Information http://jdserver.homelinux.org/linux/reg_dump.tar.bz2 Detailed System Information http://jdserver.homelinux.org/linux/sysinfo-2.6.25-rc6 Kernel Config http://jdserver.homelinux.org/linux/config-2.6.25-rc6 Hope that can help find the problem. If you need me to test anything else I'll try. Justin --
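For readers who want to reproduce the "dump every 6 secs for 1 min" approach, here is a minimal sketch. It is not Justin's actual script (that was not posted); the function name, the output file naming, and the default count of 10 are assumptions for illustration. It simply runs whatever command it is given and saves each run's output to a numbered file.

```python
import subprocess
import time
from pathlib import Path


def capture_dumps(command, out_dir, count=10, interval=6.0):
    """Run `command` `count` times, `interval` seconds apart.

    Each run's stdout and stderr are written to out_dir/reg_dump.NN,
    numbered from 00. Returns the list of files written.
    (Hypothetical helper; names and defaults are illustrative.)
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        # A failing command still produces a file, so a gap in the
        # sequence never hides a dump that went wrong.
        result = subprocess.run(command, capture_output=True, text=True)
        path = out_dir / f"reg_dump.{i:02d}"
        path.write_text(result.stdout + result.stderr)
        paths.append(path)
        if i < count - 1:
            time.sleep(interval)
    return paths
```

Called early in boot as something like capture_dumps(["/usr/local/bin/intel_reg_dumper"], "/var/log/regdump") (both paths are assumptions), this would give ten snapshots covering roughly the first minute.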
Wow, that's a lot of dump files. :) I was worried that in the "blank" case we may see the same register dump as in the working case, but thankfully they're different. In fact, in all the dumps after 0 in the 2.6.25-blank case, both pipes are disabled and the LCD itself is disabled. The important bits:
@@ -24,7 +24,7 @@
 (II): DVOB_SRCDIM: 0x00000000
 (II): DVOC_SRCDIM: 0x00000000
 (II): PP_CONTROL: 0x00000001 (power target: on)
-(II): PP_STATUS: 0xc0000008 (on, ready, sequencing idle)
+(II): PP_STATUS: 0x0000000a (off, not ready, sequencing idle)
 (II): PFIT_CONTROL: 0x80002668
 (II): PFIT_PGM_RATIOS: 0x00000000
 (II): PORT_HOTPLUG_EN: 0x00000020
@@ -36,7 +36,7 @@
 (II): DSPABASE: 0x00000000
 (II): DSPASURF: 0x00000000
 (II): DSPATILEOFF: 0x00000000
-(II): PIPEACONF: 0x00000000 (disabled, single-wide)
+(II): PIPEACONF: 0x000c0000 (disabled, single-wide)
 (II): PIPEASRC: 0x027f01df (640, 480)
 (II): PIPEASTAT: 0x80000203 (status: FIFO_UNDERRUN VSYNC_INT_STATUS VBLANK_INT_STATUS OREG_UPDATE_STATUS)
 (II): FBC_CFB_BASE: 0x00000000
@@ -59,16 +59,16 @@
 (II): VSYNC_A: 0x01eb01e9 (490 start, 492 end)
 (II): BCLRPAT_A: 0x00000000
 (II): VSYNCSHIFT_A: 0x00000000
-(II): DSPBCNTR: 0x95000000 (enabled, pipe B)
+(II): DSPBCNTR: 0x15000000 (disabled, pipe B)
 (II): DSPBSTRIDE: 0x00000500 (1280 bytes)
 (II): DSPBPOS: 0x00000000 (0, 0)
 (II): DSPBSIZE: 0x01df027f (640, 480)
 (II): DSPBBASE: 0x00000000
 (II): DSPBSURF: 0x00000000
 (II): DSPBTILEOFF: 0x00000000
-(II): PIPEBCONF: 0x80000000 (enabled, single-wide)
+(II): PIPEBCONF: 0x000c0000 (disabled, single-wide)
 (II): PIPEBSRC: 0x027f01df (640, 480)
-(II): PIPEBSTAT: 0x00000202 (status: VSYNC_INT_STATUS ...
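Comparing many dump files by eye is tedious, so a small script can list only the registers that differ. The sketch below assumes each dump line looks like "(II): NAME: 0xVALUE (description)", as in the excerpt above; the function names are hypothetical. The bit-31 check reflects only what these particular dumps show (the "enabled" annotations on PIPEBCONF and DSPBCNTR coincide with bit 31 being set), not a full register decode.

```python
import re

# Matches lines such as "(II): PIPEBCONF: 0x80000000 (enabled, single-wide)".
REG_LINE = re.compile(r"\(II\):\s+(\w+):\s+(0x[0-9a-fA-F]+)")


def parse_dump(text):
    """Return {register name: integer value} for every register line."""
    return {name: int(value, 16) for name, value in REG_LINE.findall(text)}


def diff_dumps(working, blank):
    """List (name, working value, blank value) for registers that differ."""
    a, b = parse_dump(working), parse_dump(blank)
    return [(name, a[name], b[name])
            for name in sorted(a.keys() & b.keys())
            if a[name] != b[name]]


def bit31_set(value):
    """True if the top bit of a 32-bit register value is set."""
    return bool(value >> 31 & 1)


if __name__ == "__main__":
    working = "(II): PIPEBCONF: 0x80000000 (enabled, single-wide)\n"
    blank = "(II): PIPEBCONF: 0x000c0000 (disabled, single-wide)\n"
    for name, old, new in diff_dumps(working, blank):
        print(f"{name}: {old:#010x} -> {new:#010x}")
```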
Ok, I have the X logs:
http://jdserver.homelinux.org/linux/Xorg.0.log-blank
http://jdserver.homelinux.org/linux/Xorg.0.log-good
Below is just a portion of the diff of those files.
--- Xorg.0.log-blank 2008-03-26 11:14:12.000000000 -0700
+++ Xorg.0.log-good 2008-03-26 11:14:35.000000000 -0700
@@ -20,7 +20,7 @@
 Markers: (--) probed, (**) from config file, (==) default setting,
 (++) from command line, (!!) notice, (II) informational,
 (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
-(==) Log file: "/var/log/Xorg.0.log", Time: Wed Mar 26 09:59:56 2008
+(==) Log file: "/var/log/Xorg.0.log", Time: Wed Mar 26 10:02:12 2008
 (==) Using config file: "/etc/X11/xorg.conf"
 (==) ServerLayout "Default Layout"
 (**) |-->Screen "Default Screen" (0)
@@ -470,9 +470,9 @@
 (WW) intel(0): Register 0x61200 (PP_STATUS) changed from 0xc0000008 to 0xd0000009
 (WW) intel(0): PP_STATUS before: on, ready, sequencing idle
 (WW) intel(0): PP_STATUS after: on, ready, sequencing on
-(WW) intel(0): Register 0x71024 (PIPEBSTAT) changed from 0x00000202 to 0x00000242
-(WW) intel(0): PIPEBSTAT before: status: VSYNC_INT_STATUS VBLANK_INT_STATUS
-(WW) intel(0): PIPEBSTAT after: status: VSYNC_INT_STATUS LBLC_EVENT_STATUS VBLANK_INT_STATUS
+(WW) intel(0): Register 0x71024 (PIPEBSTAT) changed from 0x80000202 to 0x80000242
+(WW) intel(0): PIPEBSTAT before: status: FIFO_UNDERRUN VSYNC_INT_STATUS VBLANK_INT_STATUS
+(WW) intel(0): PIPEBSTAT after: status: FIFO_UNDERRUN VSYNC_INT_STATUS LBLC_EVENT_STATUS VBLANK_INT_STATUS
 (WW) intel(0): Register 0x68000 (TV_CTL) changed from 0x10000000 to 0x000c0000
 (WW) intel(0): Register 0x68010 (TV_CSC_Y) changed from 0x00000000 to 0x0332012d
 (WW) intel(0): Register 0x68014 (TV_CSC_Y2) changed from 0x00000000 to 0x07d30104
@@ -735,11 +735,73 @@
 (II) intel(0): fbc disabled on plane a
 (II) intel(0): fbc disabled on plane a
 (II) intel(0): fbc disabled on plane a
-(II) intel(0): xf86UnbindGARTMemory: unbind key 0
-(II) intel(0): ...
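The "Register ... changed from ... to ..." warnings in these logs can be reduced to the bits that actually flipped by XORing the two values. The sketch below is an illustration only; the function name is hypothetical, and it reports bit positions rather than names, since the log excerpt does not say which position each status name occupies.

```python
import re

# Matches e.g. "Register 0x71024 (PIPEBSTAT) changed from 0x00000202 to 0x00000242".
CHANGE = re.compile(
    r"Register (0x[0-9a-fA-F]+) \((\w+)\) changed from "
    r"(0x[0-9a-fA-F]+) to (0x[0-9a-fA-F]+)"
)


def flipped_bits(line):
    """Return (register name, [bit positions that differ]) or None."""
    match = CHANGE.search(line)
    if match is None:
        return None
    _offset, name, before, after = match.groups()
    delta = int(before, 16) ^ int(after, 16)  # 1 wherever the values differ
    return name, [bit for bit in range(32) if delta >> bit & 1]


if __name__ == "__main__":
    sample = ("(WW) intel(0): Register 0x71024 (PIPEBSTAT) "
              "changed from 0x00000202 to 0x00000242")
    print(flipped_bits(sample))
```

For the PIPEBSTAT line in either log only bit 6 changes, which lines up with LBLC_EVENT_STATUS being the one name added in the "after" annotation. The blank and good logs differ in bit 31 of the starting value, matching the extra FIFO_UNDERRUN in the good log.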
Well, I disabled gdm and tried to trigger the blank screen. I did about 16 reboots and it blanked out on me only 2 times. I was using a script to reboot and/or startx. One time I'm not exactly sure if X was started, so it might have blanked out on a "startx". The other time I'm fairly sure X wasn't started, so it blanked out on a terminal login. Both of the blank outs were different than the ones with gdm started. Pressing ctrl+alt+f# changed nothing on the screen; the screen seemed almost completely off (no or little backlight). A few seconds after pressing the power button the shutdown splash screen would show, but this time it was _very_ faint. Usually, when gdm is enabled, pressing ctrl+alt+f# would "refresh" (or mode/resolution change) the screen, but it would still be blank. Also the backlight still seemed to be on and at full brightness (although, still displaying black). Well, I don't know what to say, it's the strangest of problems. Justin --
Yeah, seems pretty weird. Given that you see it w/o the fb stuff loaded as well and we still have a few open bugs against the intel X driver regarding VT switch & mode programming, I don't think this is a real kernel regression. It's more likely that some timing or memory layout changed subtly and is causing you to hit one of our existing bugs more frequently than you did before. Can you file a bug against the intel X driver at bugs.freedesktop.org so we can track it there? Unless we can find a way to reproduce it reliably it'll probably take a long time to fix, but we don't want to lose it either... Thanks, Jesse --
Well, I'll file a bug on bugs.freedesktop.org, but if you don't think it's a kernel regression then I'll wait until the final release of 2.6.25 comes out (unless you _really_ need me to file it sooner). I still think it's somehow related to something that changed in the kernel from v24 to v25 because I've never had it happen with a kernel version less than 2.6.25. You say it's a timing issue; I've searched and found two things that have changed in v25: Preemptive RCU and I/O Port Delay. I've enabled both preemptive RCU and no I/O port delay. I've recompiled with both disabled and found that the blank screen _still_ happens. So, I'm figuring that _maybe_ by adding these options the kernel developers needed to change something that exposes something related to the intel X.org driver that's no longer necessarily true (or something like that - do you get what I'm trying to say?). Or is there another X.org intel driver? And if so how are they (agp/drm/X.org) related? Justin --
Well, given what we've tested so far, it really doesn't seem like a framebuffer layer regression nor a DRM regression... I suppose you could try bisecting, but given that the problem doesn't happen every time that might You could try using the X vesa driver instead of the intel driver... Jesse --
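To put a rough number on why an intermittent failure makes bisecting painful: if a bad kernel blanks the screen on a fraction p of boots, the chance of seeing n clean boots in a row on that bad kernel is (1 - p)^n. The sketch below solves for n. The function name is hypothetical, and treating boots as independent trials with a fixed rate is an assumption. The 2-in-16 rate is taken from the gdm-disabled test reported earlier and may not match other setups.

```python
import math


def clean_boots_needed(failure_rate, confidence=0.95):
    """Smallest n such that (1 - failure_rate) ** n <= 1 - confidence.

    In other words, how many consecutive clean boots are needed before
    calling a kernel "good" with the given confidence, assuming each
    boot fails independently with probability `failure_rate`.
    """
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


if __name__ == "__main__":
    # 2 blank screens in about 16 reboots, as reported with gdm disabled.
    print(clean_boots_needed(2 / 16))
```

Under those assumptions each "good" verdict in a bisection would take 23 clean reboots to reach 95% confidence, and that cost repeats at every bisection step.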
Ok, so I tried with the vesa X.org driver and it never had the blank screen problem (so far). There were some screen distortions on mode change but no blank screen. I noticed that when using the vesa X.org driver the i915 kernel module didn't load. I think we're getting closer! It seems like it's not a kernel regression but a bug in the intel Xorg driver. Or maybe an interaction between the i915 DRM kernel module (>=2.6.25) and the intel X.org driver, because it only happens with the 2.6.25 kernel, and I'd think if it was only an X driver bug, then it should've happened on other kernel versions. Justin --
Looks like we still are on 0.8. I can check into upping us to 0.10 this afternoon or tomorrow - even though it's still in universe and we can't include it yet by default for -intel, that'll make it simpler for users that need to build it. Stay tuned. Bryce --
