
Question regarding DPU cost -

User account - akhilalankala

If I have a scenario where I need to get tweets matching given criteria for 2 different sets of queries and tag them individually, I see that I can do it in 2 ways -

Method 1 -
1. Generate an individual hash for each query -
streamHash1 = 88f2ad360cbe8ef083ac24e061756b18
streamHash1 hash dpu = 0.7
streamHash2 = 20b667658d9c0b900ae40fbdd43dec40
streamHash2 hash dpu = 0.9

2. Tag each stream -
result =
tag "C1" { stream "88f2ad360cbe8ef083ac24e061756b18" }
tag "C2" { stream "20b667658d9c0b900ae40fbdd43dec40" }
return { stream "88f2ad360cbe8ef083ac24e061756b18" or stream "20b667658d9c0b900ae40fbdd43dec40" }

masterHash = 637dc0ff3683a675899f0d40c74f4c4b
master hash dpu = 1.6

Method 2 -
Generate the tagging using the search query itself as the tag definition, and use the same query as the filter criteria -
tag "C1"{((twitter.text contains_any "marketing analytics,demand generation,Consumer Analytics,customer acquisition,conversion rate optimization,customer segmentation,content optimization,cross channel personalization,consumer behavior,interactive marketing")) and ((not twitter.text contains_any "careers,recruit,jobs,career,job,recruitment,hiring,hire")) and (twitter.lang == "en") and (demographic.gender contains_any "male,female,mostly_male,mostly_female")}

tag "C2"{((twitter.text contains_any "search marketing,Torso Term,pay per click,content strategy,content management,content lifecycle,user acquisition,Longtail,conversion optimization,Bid Management,search advertising,Ebusiness,direct response marketing,Keyword expansion,Keyword research,Global SEO,Head Term,keyword strategies,account restructure,advertising creation,media management,ad budget,web content,paid search,long-tail")) and (twitter.lang == "en") and (demographic.gender contains_any "male,female,mostly_male,mostly_female")}

return {( ((twitter.text contains_any "marketing analytics,demand generation,Consumer Analytics,customer acquisition,conversion rate optimization,customer segmentation,content optimization,cross channel personalization,consumer behavior,interactive marketing")) and ((not twitter.text contains_any "careers,recruit,jobs,career,job,recruitment,hiring,hire")) and (twitter.lang == "en") and (demographic.gender contains_any "male,female,mostly_male,mostly_female") ) or ( ((twitter.text contains_any "search marketing,Torso Term,pay per click,content strategy,content management,content lifecycle,user acquisition,Longtail,conversion optimization,Bid Management,search advertising,Ebusiness,direct response marketing,Keyword expansion,Keyword research,Global SEO,Head Term,keyword strategies,account restructure,advertising creation,media management,ad budget,web content,paid search,long-tail")) and (twitter.lang == "en") and (demographic.gender contains_any "male,female,mostly_male,mostly_female"))}

masterHash = bb21c56d94620996a5c1438c8e5cdf54
master hash dpu = 1.4

Questions -
1. In the case of Method 1, would you be charging us for DPU = 0.7 + 0.9 + 1.6, or is it just 0.7 + 0.9?
2. If I use Method 2, will it result in the same data as Method 1, with a charge of just 1.4?

Which of the 2 methods would you suggest?

shirish


Method 1 charges only for the master stream DPU of 1.6.

Method 2 results in the same data as long as the CSDL statements you are using are the same as in the other queries.

Personally, I'd use Method 2 because you have the opportunity to consolidate logic in the tagging portion. You could make it so you only require demographic.gender and twitter.lang once, in the return statement, thus lowering the DPU cost further.
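
A sketch of what that consolidation might look like, reusing the keyword lists from the question. It assumes the tag blocks only need the text conditions, because the return block already restricts the stream to English tweets with a known gender. It has not been compiled, so the resulting DPU is not known; the /dpu endpoint would confirm whether it actually comes in under 1.4.

```
// Tags carry only the conditions that tell C1 and C2 apart.
tag "C1" { (twitter.text contains_any "marketing analytics,demand generation,Consumer Analytics,customer acquisition,conversion rate optimization,customer segmentation,content optimization,cross channel personalization,consumer behavior,interactive marketing") and (not twitter.text contains_any "careers,recruit,jobs,career,job,recruitment,hiring,hire") }

tag "C2" { twitter.text contains_any "search marketing,Torso Term,pay per click,content strategy,content management,content lifecycle,user acquisition,Longtail,conversion optimization,Bid Management,search advertising,Ebusiness,direct response marketing,Keyword expansion,Keyword research,Global SEO,Head Term,keyword strategies,account restructure,advertising creation,media management,ad budget,web content,paid search,long-tail" }

// Language and gender are checked once, for both sets of keywords.
return {
  (twitter.lang == "en")
  and (demographic.gender contains_any "male,female,mostly_male,mostly_female")
  and (
    ( (twitter.text contains_any "marketing analytics,demand generation,Consumer Analytics,customer acquisition,conversion rate optimization,customer segmentation,content optimization,cross channel personalization,consumer behavior,interactive marketing") and (not twitter.text contains_any "careers,recruit,jobs,career,job,recruitment,hiring,hire") )
    or (twitter.text contains_any "search marketing,Torso Term,pay per click,content strategy,content management,content lifecycle,user acquisition,Longtail,conversion optimization,Bid Management,search advertising,Ebusiness,direct response marketing,Keyword expansion,Keyword research,Global SEO,Head Term,keyword strategies,account restructure,advertising creation,media management,ad budget,web content,paid search,long-tail")
  )
}
```

Since every interaction that reaches the tags has already passed the return filter, the tags should label the same tweets as in the original Method 2.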

jbreucop

Sorry, when you say "method 1 charges only for the master stream dpu of 1.6", can you clarify whether you charge for the 2 individual stream hashes only, and not for the master hash separately again?

streamHash1= 88f2ad360cbe8ef083ac24e061756b18
streamHash1 hash dpu = 0.7
streamHash2 = 20b667658d9c0b900ae40fbdd43dec40
streamHash2 hash dpu = 0.9

shirish

When using method 1, the DPU cost of the master stream would be a sum of the DPU costs of any 'sub-streams', and any logic contained in the master stream.
If you call the /dpu API endpoint for each of these stream hashes, you can see exactly what you are being charged for: http://dev.datasift.com/docs/api/1/dpu

Jason D.