Spreadsheet help

"Please leave a message at the beep, we will get back to you when your support contract expires."

Moderators: phlip, Moderators General, Prelates

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Spreadsheet help

Postby Angua » Thu Apr 23, 2015 12:29 pm UTC

I've just been given some data to try to make a project out of. It's masses of information on two different spreadsheets. While they both have the same unique identifiers, I need someway of weeding out the data that doesn't appear on both sheets as I am supposed to be trying to correlate data b with data a. Is there a way to do this easily? I have access to excel and open office.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu Apr 23, 2015 12:39 pm UTC

Add a column to Sheet1 and fill in this formula to the first row (the formula assumes you have headers).

Code: Select all

=COUNTIF(Sheet2!A:A,A2)


Then copy down.

Add a column to Sheet2 and fill in this formula to the first row.

Code: Select all

=COUNTIF(Sheet1!A:A,A2)


Then copy down.

Anything that is greater than 0 has a match on the other sheet.

Drumheller769
Posts: 732
Joined: Mon May 11, 2009 7:46 pm UTC
Location: ♞♞♞

Re: Spreadsheet help

Postby Drumheller769 » Thu Apr 23, 2015 12:48 pm UTC

A quick google search offers 2 sorta easy solutions, note I haven't tried either of them, just trying to help:

This method:
http://www.get-digital-help.com/2011/01 ... -in-excel/

or the match function:
http://stackoverflow.com/questions/1450 ... cell-value
The Great Hippo wrote:Arguing with the internet is a lot like arguing with a bullet. The internet's chief exports are cute kittens, porn, and Reasons Why You Are Completely Fucking Wrong.

User avatar
Quercus
Posts: 1735
Joined: Thu Sep 19, 2013 12:22 pm UTC
Location: London, UK
Contact:

Re: Spreadsheet help

Postby Quercus » Thu Apr 23, 2015 1:54 pm UTC

Whizbang's solution sounds good, but if your project is reasonably long-term and contains a large number of such operations it might be worth taking some time to learn basic python + pandas (a data analysis library that interfaces nicely with excel files). In my PhD I ended up with about 400,000 DNA sequencing reads spread over about 100 worksheets, and learning to code a bit was the best decision I ever made in that project - I think I would still be crunching data now if I hadn't been able to make some scripts to automate the analysis.

Actually if this is a one off thing and you have trouble with getting excel to do what you want I could see if I could write a script for you (I am wary of making promises - I haven't coded anything in a few months, so I might be a bit rusty). It sounds like a simple inner join operation, which, if I reuse the I/O code from my PhD scripts, should literally be a single line of code. Let me know if you're interested. I'm pretty busy today and tomorrow, but I should definitely have time on Saturday.

Edit: Drumheller769's solutions are probably good too - I just haven't looked at those.

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu Apr 23, 2015 3:26 pm UTC

Thanks, this is going to be a one off thing for now, but I'll probably look into it later as hopefully will be doing similar stuff next year for my 4month public health job.

How easy is it to write a script to get split info that for some reason has all been written into one cell? Probably impossible as I suspect it's all been hand typed in, so not the most consistent of information making :P
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu Apr 23, 2015 3:28 pm UTC

Look into Data - Text to Columns.

https://support.microsoft.com/en-us/kb/214261

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu Apr 23, 2015 3:34 pm UTC

Thanks! Half the battle is knowing what what you want to do is called.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Quercus
Posts: 1735
Joined: Thu Sep 19, 2013 12:22 pm UTC
Location: London, UK
Contact:

Re: Spreadsheet help

Postby Quercus » Thu Apr 23, 2015 4:02 pm UTC

Angua wrote:Thanks, this is going to be a one off thing for now, but I'll probably look into it later as hopefully will be doing similar stuff next year for my 4month public health job.


Cool, if you do want to learn that stuff later here's the resources I used. I'm don't know if they're the best ones, but they worked for me:

Python: Learn python the hard way coupled with a little bit of python for biologists
Pandas/data analysis: Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

I also use anaconda as my python distribution because it's already set up for scientific work out of the box (and includes the spyder IDE, which is the best one I have found for semi-interactive data wrangling stuff).

Good luck with your project!

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu Apr 23, 2015 4:05 pm UTC

Regarding Excel help, I use http://ExcelForum.com (I am Whizbang there too). I started out there asking questions, moved to answering them, now I am a lot less active, but the community there is full of a lot of very clever and experienced Excel super users. Excel, especially with VBA, is a lot more powerful and versatile than most people realize.

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu Apr 23, 2015 4:08 pm UTC

Whizbang's solution isn't working for some reason. When it brings up the dialogue box asking me which sheet to use, what do I put?

edit - figured it out - the second sheet name had a space in it. Working now I've removed the space.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu Apr 23, 2015 4:12 pm UTC

Are you doing Text-to-Columns? Can you post a screen shot or describe more fully?

Or are you doing the formulas in my first post? If the formulas, you need to change "Sheet1" and "Sheet2" to whatever sheet names you are using. If your sheet names include spaces, you need to use single quotes. (eg. 'My Sheet'!A:A)

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu Apr 23, 2015 4:18 pm UTC

Ok, so given data A is the most important (ie, need that one to do project), and i've found that there are actually 3 other sheets that I need to cross reference, my plan is to put whizbang's formula in the data A sheet with one column each for b-d, then have a column to add up the values b-d and get rid of any that are 0.

Sound reasonable, or am I missing something?
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu Apr 23, 2015 4:51 pm UTC

Sounds reasonable. [edit]On second thought, it only sounds reasonable if you want to include all records that have matches in any sheet, regardless if it has matches in all.[/edit]

You can also do it in a single column using this:

Code: Select all

=Sumproduct(--(Sheet2!A:A=A2),--(Sheet3!A:A=A2),--(Sheet4!A:A=A2))


A zero means it is not found in at least one of the sheets.

Be aware that if you have many thousands of rows, this formula may not be the best solution, as it is processor intensive.

In which case use this (which is the same, processor-wise, as the multiple column method, but in a single column):

Code: Select all

=Sum(Countig(Sheet2!A:A,A2),Countif(Sheet3!A:A,A2),Countif(Sheet4!A:A,A2))

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu Apr 23, 2015 5:12 pm UTC

Thanks - I'd already done it my way. I figure it might be useful to have the data even if it isn't in everything just in case we're just looking at the correlation with one parameter.

I've got 5500 rows (nice round number cut down from just under 8000).

Now to figure out if I have any hope on the splitting cell into columns. Annoyingly, no commas, and just the use of caps to indicate a new thing.

Thanks so much for all your help.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu Apr 23, 2015 5:33 pm UTC

My housemate is going to help me code it once I get my head around everything (he's currently doing a PhD in genetics, so says it will be really quick).

Hurray, because the text to cell wasn't working especially brilliantly.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Fri Apr 24, 2015 9:43 am UTC

So, googling this problem seems to be failing me - I want to get it to alternate the row colour depending on the unique id (so that way all the same id is the same colour) to make it easier for me to read.

So far I've managed to get a second row that alternates between 1 and 0, but I can't for the life of me manage to get the conditional formatting to change the entire row based on that cell value.

nvm, got it =$B2
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Sat May 23, 2015 12:38 pm UTC

Ok, so I need help making a macro? (I've never made one before)

So, I have 2 samples of data per patient. They are currently arranged so they are in rows (ie 2 rows per patient). What I want is a macro that will copy the bottom row of data so that both sets of data are in the same row (because that's how I have to paste it over into another sheet).

Basically, I need something like:
If $A3 = $A2 Then Copy ([$A3-$I3])Paste ([$J2-$R2])
but I don't know how to make that into something that xcel understands.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Sat May 23, 2015 9:55 pm UTC

Something like this.

(Not tested)

Code: Select all

Private Sub MergePatients()

    Dim i As Long
   
    ActiveSheet.Sort.SortFields.Clear
    ActiveSheet.Sort.SortFields.Add Key:=Range("A1"), _
        SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    With ActiveSheet.Sort
        .SetRange Range("A:I")
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
   
    For i = 2 To ActiveSheet.Range("A65536").End(xlUp).Row
        If Range("A" & i).Value = Range("A" & i - 1).Value Then
            Range("A" & i & ":I" & i).Copy Range("J" & i - 1)
            Range("A" & i).EntireRow.ClearContents
        End If
    Next i
   
    ActiveSheet.Sort.SortFields.Clear
    ActiveSheet.Sort.SortFields.Add Key:=Range("A1"), _
        SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
    With ActiveSheet.Sort
        .SetRange Range("A:R")
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
   
End Sub

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Sat May 23, 2015 10:12 pm UTC

Thanks!

I realised that there were exactly two values for everything, so ended up getting it to record myself moving the bottom set of values then deleting the row, copying that code a bunch of times so it did lots of rows at once, and then working my way down the sheet. Then I used conditional formatting to check to make sure I hadn't made any mistakes anywhere. However, I'm sure I'll probably have to redo something similar so I'm sure that will come in handy. I didn't get around to posting that I'd managed a work around, but thank you so much for writing that.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
WanderingLinguist
Posts: 237
Joined: Tue May 22, 2012 5:14 pm UTC
Location: Seoul
Contact:

Re: Spreadsheet help

Postby WanderingLinguist » Tue May 26, 2015 12:16 am UTC

What's in the $J2-$R2 if the condition is false? Do you need to keep the existing data there, or can it be blank if there's no match? If it can be blank, you could just do it with a conditional formula. Put it in $J2 and then just select all the cells and fill them with the same formula. Something like:

Code: Select all

  =IF($A3=$A2,A3,0)

Or if it's a string...

Code: Select all

  =IF($A3=$A2,A3,"")

You just need to make sure the cell references are relative where they need to be relative and absolute where they need to be absolute so that they fill in properly...

Or did I misunderstand what you're trying to do?

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Tue May 26, 2015 7:02 am UTC

Nothings in the other cells at the moment.

That certainly looks a lot easier than what I ended up doing. Thanks. I'll remember for next time.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu May 28, 2015 6:32 pm UTC

Ok, so I have been doing the same thing over and over because it keeps bugging for some reason, but I think (finger's crossed) I finally have the data from two different datasets that I'm ready to combine). However, I'm having trouble finding a formula for what I want to do:
Basically, in columns a and b I have the patient id and date of sample for one type of test, and in columns i and j I have the same for the second type of test, and what I want to do is only count them if the id and date are the same, then I can just filter for all the ones that are 0.

Googling gave me =countif(a:a, i2; b:b, j2) but that is giving me an error?

edit, also, I don't think that formula would work anyway, as I know that all the ids are definitely in both columns, and all the dates are definitely in both columns, but it wouldn't tell me if the two pairs are the same?

I want to know if the two pairs exist anywhere in the data sets, because they aren't guaranteed to be in the same row.
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu May 28, 2015 6:37 pm UTC

=countifs(a:a, i2; b:b, j2)

Note the "s".

I'll respond more fully after I give it a read or two.

[Edit]

Ok, so A,B and I,J should equal, yes? If so, then you want to be able to filter for just the matching items, yes?

In some column, put =A2&B2=I2&J2 and filter for any TRUE values.

Or is your AB table seperate than your IJ table? As in, your AB value could be anywhere in the IJ table, not necessarily in the same row?

If so, your countifs formula should work.

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu May 28, 2015 7:01 pm UTC

It's the second one - they're two separate tables with differing numbers of samples per patient.

For some reason excel is still thinking there's a problem with the formula? I did copypasta so isn't the lack of an s (thanks by the way, would probably not have noticed it without you pointing it out).
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu May 28, 2015 7:03 pm UTC

What version of Excel are you running?

Try

=SUMPRODUCT(--(A:A=I2),--(B:B=J2))

User avatar
Angua
Don't call her Delphine.
Posts: 5774
Joined: Tue Sep 16, 2008 12:42 pm UTC
Location: UK/[St. Kitts and] Nevis Occasionally, I migrate to the US for a bit

Re: Spreadsheet help

Postby Angua » Thu May 28, 2015 7:06 pm UTC

It's excel 2013.

That sumproduct one is working, thanks so much!!!
Crabtree's bludgeon: “no set of mutually inconsistent observations can exist for which some human intellect cannot conceive a coherent explanation, however complicated”
GNU Terry Pratchett

User avatar
Whizbang
The Best Reporter
Posts: 2238
Joined: Fri Apr 06, 2012 7:50 pm UTC
Location: New Hampshire, USA

Re: Spreadsheet help

Postby Whizbang » Thu May 28, 2015 7:08 pm UTC

Ah, found the problem with the countifs. Replace the semicolon with a comma.

=countifs(a:a, i2, b:b, j2)


Return to “The Help Desk”

Who is online

Users browsing this forum: No registered users and 10 guests