Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Customer statistics with a volatility greater than the required volatility #1

Open
khainguyen2003 opened this issue Jan 14, 2025 · 0 comments

Comments

@khainguyen2003
Copy link

Can someone help me solve this problem?

  • Each month, there will be about 24 million rows of customer spending data.
  • My task is to generate a list of CIFs where the total volatility for a user-specified column (pi_colName) exceeds the user-specified threshold (pi_min_volatility) from a reference date (pi_rdate) to the specified date (pi_date).

Requirements:

  • The query must execute in under 1 minute.
  • Indexing is not allowed because it would require too much memory to store indexes for the new data added each month.
  • I am using Oracle Database.
    Thank you very much for your help!

Here is my table

CREATE TABLE "VOLATILITY_REPORT_041_DBTK" 
   (	"ID" NUMBER GENERATED ALWAYS AS IDENTITY MINVALUE 1 MAXVALUE 9999999999999999999999999999 INCREMENT BY 1 START WITH 1 CACHE 20 NOORDER  NOCYCLE  NOKEEP  NOSCALE , 
	"PERIOD" DATE, 
	"BRANCH" VARCHAR2(4 BYTE), 
	"CURTYPE" VARCHAR2(20 BYTE), 
	"CIF" VARCHAR2(20 BYTE), 
	"ACCTNO" VARCHAR2(50 BYTE), 
	"CNAME" VARCHAR2(250 BYTE), 
	"E_SECTOR" VARCHAR2(250 BYTE), 
	"TERM" VARCHAR2(250 BYTE), 
	"TG_USD_3" NUMBER(20,2), 
	"GTCG_USD_6" NUMBER(20,2),
	"UNIT" VARCHAR2(20 BYTE) DEFAULT 'Thounsand'
   ) ;

  ALTER TABLE "VOLATILITY_REPORT_041_DBTK" MODIFY ("ID" NOT NULL ENABLE);

Here is my code, but it takes around 2 minutes to run:

PROCEDURE VOLATILITY_REPORT_041_DBTK
(   pi_colName         IN VARCHAR2,
    pi_date           IN VARCHAR2,
    pi_rdate   IN VARCHAR2,
    pi_branch            IN VARCHAR2,
    pi_min_volatility        IN NUMBER,

    PI_Start                IN NUMBER,
    PI_End                  IN NUMBER,
    -- PARAM OUT
    PO_TotalRow             OUT NUMBER,
    -- DEFAULT
    PO_ErrorCode            OUT VARCHAR2,
    PO_ErrorDesc            OUT VARCHAR2,
    PO_Cursor               OUT SYS_REFCURSOR
) AS 
  l_upper_colNam VARCHAR2(100);
    l_sql          CLOB;
    l_count_sql    CLOB;
    l_exits        NUMBER;
    table_exists NUMBER;
BEGIN
    
    SELECT COUNT(*) INTO table_exists
    FROM USER_TABLES
    WHERE TABLE_NAME = 'DAT_VOLATILITY_REPORT_041_DBTK_TEMP';
    IF table_exists > 0 THEN
        EXECUTE IMMEDIATE 'DROP TABLE DAT_VOLATILITY_REPORT_041_DBTK_TEMP';
    END IF;

    l_sql := '
    CREATE TABLE DAT_VOLATILITY_REPORT_041_DBTK_TEMP AS
    WITH 
    FULL_THIS_TERM_DAT AS (
        SELECT
            A.CIF,
            A.BRANCH,
            A.ACCTNO,
            A.CURTYPE,
            A.E_SECTOR,
            A.UNIT,
            NVL(TO_NUMBER(A.' || l_upper_colNam || '), 0) AS THIS_TERM_DAT
        FROM VOLATILITY_REPORT_041_DBTK A
        WHERE A.PERIOD = TO_DATE(''' || pi_date || ''', ''DD/MM/YYYY'')
            AND (' || CASE WHEN pi_branch IS NULL OR pi_branch = '' THEN '1=1' ELSE 'A.BRANCH = ''' || pi_branch || '''' END || ')
    ),
    FULL_PREV_TERM_DAT AS (
        SELECT
            B.CIF,
            B.BRANCH,
            B.ACCTNO,
            NVL(TO_NUMBER(B.' || l_upper_colNam || '), 0) AS PREV_TERM_DAT
        FROM VOLATILITY_REPORT_041_DBTK B
        WHERE B.PERIOD = TO_DATE(''' || pi_rdate || ''', ''DD/MM/YYYY'')
            AND (' || CASE WHEN pi_branch IS NULL OR pi_branch = '' THEN '1=1' ELSE 'B.BRANCH = ''' || pi_branch || '''' END || ')
    ),
    THIS_TERM_DAT AS (
        SELECT
            CIF,
            SUM(THIS_TERM_DAT) AS THIS_TERM_DAT
        FROM
            FULL_THIS_TERM_DAT
        GROUP BY CIF
    ),
    PREV_TERM_DAT AS (
        SELECT
            CIF,
            SUM(PREV_TERM_DAT) AS PREV_TERM_DAT
        FROM FULL_PREV_TERM_DAT
        GROUP BY CIF
    ),
    VARIATION AS (
        SELECT
            KN.CIF,
            NVL(KN.THIS_TERM_DAT, 0) AS TONG_THIS_TERM_DAT,
            NVL(KT.PREV_TERM_DAT, 0) AS TONG_PREV_TERM_DAT,
            ABS(NVL(KN.THIS_TERM_DAT, 0) - NVL(KT.PREV_TERM_DAT, 0)) AS pi_min_volatility
        FROM
            THIS_TERM_DAT KN
            LEFT JOIN PREV_TERM_DAT KT ON KN.CIF = KT.CIF
        WHERE
            ABS(NVL(KN.THIS_TERM_DAT, 0) - NVL(KT.PREV_TERM_DAT, 0)) >= NVL(' || pi_min_volatility || ', 0)
    ),
    DATA_CHI_TIET AS (
        SELECT
            A.CIF,
            A.BRANCH,
            A.ACCTNO,
            A.CURTYPE,
            A.E_SECTOR,
            A.UNIT,
            NVL(THIS_TERM_DAT, 0) THIS_TERM_DAT,
            NVL(PREV_TERM_DAT, 0) PREV_TERM_DAT,
            ABS(NVL(THIS_TERM_DAT, 0) - NVL(PREV_TERM_DAT, 0)) AS pi_min_volatility
        FROM
            FULL_THIS_TERM_DAT A
            LEFT JOIN FULL_PREV_TERM_DAT B ON
                A.CIF = B.CIF AND
                A.ACCTNO = B.ACCTNO
        WHERE
            A.CIF IN (SELECT CIF FROM VARIATION)
    )
    SELECT
            D.*,
            ROW_NUMBER() OVER (ORDER BY D.pi_min_volatility DESC, D.CIF) AS RN
        FROM
            DATA_CHI_TIET D';
    DBMS_OUTPUT.put_line(l_sql);
    EXECUTE IMMEDIATE l_sql;

    OPEN PO_Cursor FOR 'SELECT * FROM DAT_VOLATILITY_REPORT_041_DBTK_TEMP WHERE RN BETWEEN '||PI_Start||' AND '||PI_End;

    l_count_sql := 'SELECT COUNT(*) FROM DAT_VOLATILITY_REPORT_041_DBTK_TEMP';
    EXECUTE IMMEDIATE l_count_sql INTO PO_TotalRow;
    
    -- EXECUTE IMMEDIATE 'DROP TABLE DAT_VOLATILITY_REPORT_041_DBTK_TEMP';
  PO_ErrorCode := 200;
  PO_ErrorDesc := 'SUCCESS';

  EXCEPTION
  WHEN OTHERS THEN
      PO_ErrorCode := SQLCODE;
      PO_ErrorDesc := SQLERRM;
  END VOLATILITY_REPORT_041_DBTK;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant